(54) Title: NOVEL POLYPEPTIDES AND NUCLEIC ACIDS ENCODING SAME

(57) Abstract: The present invention provides novel isolated NOVX polynucleotides and polypeptides encoded by the NOVX
polynucleotides. Also provided are the antibodies that immunospecifically bind to a NOVX polypeptide or any derivative, variant,
mutant or fragment of the NOVX polypeptide, polynucleotide or antibody. The invention additionally provides methods in which the
NOVX polypeptide, polynucleotide and antibody are utilized in the detection and treatment of a broad range of pathological states,
as well as to other uses.

Novel Polypeptides and Nucleic Acids Encoding Same



Background of the Invention 

The invention relates generally to nucleic acids and polypeptides.

Summary of the Invention 

The present invention is based, in part, upon the discovery of novel human nucleic acid
sequences encoding polypeptides. The NOV-X nucleic acids, polynucleotides, proteins, and
polypeptides or fragments thereof described herein collectively include NOV-1, NOV-2a, and
NOV-2b, which are novel KIAA1233-like polypeptides; NOV-3a, NOV-3b, NOV-3c, and
NOV-3d, which are novel STE20-like polypeptides; and NOV-4a, NOV-4b, NOV-4c, NOV-4d,
and NOV-4e, which are novel trypsin inhibitor-like polypeptides.

In one aspect, the invention includes an isolated NOV-X nucleic acid molecule which
includes a nucleotide sequence encoding a polypeptide that includes the amino acid sequence
of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. For example, in various
embodiments, the nucleic acid can include a nucleotide sequence that includes SEQ ID NO: 1,
3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57. Alternatively, the encoded NOV-X polypeptide may
have a variant amino acid sequence, e.g., have an identity or similarity less than 100% to the
disclosed amino acid sequences, as described herein.

The invention also includes an isolated polypeptide that includes the amino acid
sequence of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, or a fragment having at
least 6 amino acids of these amino acid sequences. Also included is a naturally occurring 
polypeptide variant of a NOV-X polypeptide, wherein the polypeptide is encoded by a nucleic 
acid molecule which hybridizes under stringent conditions to a nucleic acid molecule 
consisting of a NOV-X nucleic acid molecule. 

Also included in the invention is an antibody that selectively binds to a NOV-X 
polypeptide. The antibody is preferably a monoclonal antibody, and most preferably is a 
human antibody. Such antibodies are useful, for example, in the treatment of a pathological
state in a subject wherein the treatment includes administering the antibody to the subject.

The invention further includes a method for producing a NOV-X polypeptide by
culturing a host cell expressing one of the herein described NOV-X nucleic acids under
conditions in which the nucleic acid molecule is expressed.

The invention also includes methods for detecting the presence of a NOV-X
polypeptide or nucleic acid in a sample from a mammal, e.g., a human, by contacting a sample
from the mammal with an antibody which selectively binds to one of the herein described
polypeptides, and detecting the formation of reaction complexes including the antibody and
the polypeptide in the sample. Detecting the formation of complexes in the sample indicates
the presence of the polypeptide in the sample.

The invention further includes a method for detecting or diagnosing the presence of a
disease, e.g., a pathological condition, associated with altered levels of a polypeptide having
an amino acid sequence at least 80% identical to a NOV-X polypeptide in a sample. The
method includes measuring the level of the polypeptide in a biological sample from the
mammalian subject, e.g., a human, and comparing the level detected to a level of the
polypeptide present in normal subjects, or in the same subject at a different time, e.g., prior to
onset of a condition. An increase or decrease in the level of the polypeptide as compared to
normal levels indicates a disease condition.
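
The comparison step of this method can be illustrated with a minimal Python sketch. It assumes that the normal level is taken as the mean of measurements from normal subjects and that a two-fold change marks an altered level; the function name, the fold threshold, and the example values are hypothetical and are not specified by the invention.

```python
from statistics import mean

def classify_level(sample_level, normal_levels, fold_threshold=2.0):
    """Compare a measured level with a normal reference level.

    The reference is the mean of normal_levels (an assumption), and
    fold_threshold is an illustrative cut-off, not a disclosed value.
    """
    baseline = mean(normal_levels)
    if baseline <= 0:
        raise ValueError("normal reference level must be positive")
    ratio = sample_level / baseline
    if ratio >= fold_threshold:
        return "increased"
    if ratio <= 1.0 / fold_threshold:
        return "decreased"
    return "unchanged"

# Hypothetical measurements in arbitrary units.
normal = [10.0, 12.0, 8.0]
print(classify_level(30.0, normal))
print(classify_level(4.0, normal))
print(classify_level(11.0, normal))
```

Under these assumptions, either an "increased" or a "decreased" result would correspond to the altered level that the method treats as indicative of a disease condition.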

Also included in the invention is a method of detecting the presence of a NOV-X 
nucleic acid molecule in a sample from a mammal, e.g., a human. The method includes
contacting the sample with a nucleic acid probe or primer which selectively hybridizes to the 
nucleic acid molecule and determining whether the nucleic acid probe or primer binds to a 
nucleic acid molecule in the sample. Binding of the nucleic acid probe or primer indicates the 
nucleic acid molecule is present in the sample. 
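
The probe-binding test described above can be sketched in Python as follows. This is a simplified model that assumes perfect Watson-Crick complementarity as a stand-in for selective hybridization; the function names are illustrative, and the sample is the first 24 nucleotides of the NOV-1 open reading frame.

```python
# Translation table for complementing upper-case DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def probe_binding_sites(sample, probe):
    """Return 0-based positions in sample where probe pairs perfectly.

    A probe binds the strand that matches its reverse complement, so the
    search looks for that reverse complement in the sample strand.
    """
    target = reverse_complement(probe)
    sample = sample.upper()
    positions = []
    start = sample.find(target)
    while start != -1:
        positions.append(start)
        start = sample.find(target, start + 1)
    return positions

# First 24 nucleotides of the NOV-1 ORF (nucleotides 416-439 of SEQ ID NO: 1).
sample = "ATGCCCTATGACCACTTCCAACCT"
probe = reverse_complement("GACCACTTCC")  # hypothetical 10-mer probe
print(probe_binding_sites(sample, probe))
print(probe_binding_sites(sample, "AAAAAAAAAA"))
```

A non-empty result plays the role of a detected binding event. Real hybridization also depends on stringency conditions and tolerates some mismatches, which this sketch does not model.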

The invention further includes a method for detecting or diagnosing the presence of a
disease associated with altered levels of a NOV-X nucleic acid in a sample from a mammal,
e.g., a human. The method includes measuring the level of the nucleic acid in a biological
sample from the mammalian subject and comparing the level detected to a level of the nucleic
acid present in normal subjects, or in the same subject at a different time. An increase or
decrease in the level of the nucleic acid as compared to normal levels indicates a disease
condition.

The invention also includes a method of treating a pathological state in a mammal, e.g.,
a human, by administering to the subject a NOV-X polypeptide in an amount
sufficient to alleviate the pathological condition. The polypeptide has an amino acid sequence
at least 80% identical to a NOV-X polypeptide.

Alternatively, the mammal may be treated by administering an antibody as herein 
described in an amount sufficient to alleviate the pathological condition. 

Pathological states for which the methods of treatment of the invention are envisioned 
include hematopoietic, immunological, tumor, cancer, neurodegenerative (e.g., Alzheimer's
and Parkinson's disease) and fertility disorders. 

Unless otherwise defined, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art to which this invention 
belongs. Although methods and materials similar or equivalent to those described herein can 
be used in the practice or testing of the present invention, suitable methods and materials are
described below. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In the case of conflict, the
present specification, including definitions, will control. In addition, the materials, methods,
and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the following
detailed description and claims. 

Detailed Description of the Invention 

The present invention is based, in part, upon the discovery of novel human nucleic acid
sequences and of polypeptides encoded by these nucleic acids. The nucleic acids have been
named "NOV 1-4", or collectively, "NOV-X". Representative NOV-X sequences, and
representative examples of uses of these sequences, are briefly discussed below.

Table 1 provides a summary of the NOV-X nucleic acids, their encoded polypeptides 
and homology. 



TABLE 1. Sequences and Corresponding SEQ ID Numbers

NOVX         Internal          SEQ ID NO        SEQ ID NO       Homology
Assignment   Identification    (nucleic acid)   (polypeptide)

1            10132038.0.67     1                2               KIAA1233 protein
2a           10132038.0.139    3                4               KIAA1233 protein
2b           10132038.0.136    57               5               KIAA1233 protein
3a           [illegible]       6                7               STE20 protein kinase
3b           [illegible]       8                9               STE20 protein kinase
3c           [illegible]       10               11              STE20 protein kinase
3d           [illegible]       12               13              STE20 protein kinase
4a           [illegible]       14               15              Trypsin inhibitor
4b           10093872.1        16               17              Trypsin inhibitor
4c           10093872.0.38     18               19              Trypsin inhibitor
4d           10093872.2        20               21              Trypsin inhibitor
4e           10093872.3        22               23              Trypsin inhibitor
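
The correspondence in Table 1 can be held in a simple lookup structure, sketched below in Python. The names are illustrative. The NOV-4a entry is an assumption: its row is not legible in the table, and the values 14 and 15 are inferred from the SEQ ID NO lists given in the Summary of the Invention.

```python
# NOVX assignment -> (nucleic acid SEQ ID NO, polypeptide SEQ ID NO).
SEQ_IDS = {
    "NOV-1": (1, 2),
    "NOV-2a": (3, 4),
    "NOV-2b": (57, 5),
    "NOV-3a": (6, 7),
    "NOV-3b": (8, 9),
    "NOV-3c": (10, 11),
    "NOV-3d": (12, 13),
    "NOV-4a": (14, 15),  # assumed: inferred from the Summary lists
    "NOV-4b": (16, 17),
    "NOV-4c": (18, 19),
    "NOV-4d": (20, 21),
    "NOV-4e": (22, 23),
}

def nucleic_acid_seq_ids():
    """Return the nucleic acid SEQ ID NOs in ascending order."""
    return sorted(nucleic for nucleic, _ in SEQ_IDS.values())

def polypeptide_seq_ids():
    """Return the polypeptide SEQ ID NOs in ascending order."""
    return sorted(polypeptide for _, polypeptide in SEQ_IDS.values())

print(nucleic_acid_seq_ids())
print(polypeptide_seq_ids())
```

With the assumed NOV-4a entry, the two sorted lists agree with the nucleic acid and polypeptide SEQ ID NOs recited in the Summary of the Invention.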



NOV-1: A Novel KIAA1233-like Polypeptide 

A NOV-1 sequence according to the invention is a nucleotide sequence encoding a
polypeptide related to KIAA1233 proteins, which bear sequence similarity to lacunin,
thrombospondins, proteinases, semaphorins, ADAM-TS, and properdin family members. This
invention maps to Unigene cluster Hs.18705. This cluster has been mapped to Chromosome
15 Marker stSG35204, Interval D15S115-D15S152. By integrating information from the
Online Mendelian Inheritance in Man (OMIM), this region is identified as 15q22-qter.
Therefore, the chromosomal location of the invention is Chromosome 15 Marker
stSG35204, Interval D15S115-D15S152, which corresponds to 15q22-qter.

The nucleic acid of the invention, NOV-1, encoding a KIAA1233-like protein
originating from chromosome 15, is shown in TABLE 2. The disclosed nucleic acid (SEQ ID
NO: 1) is a full-length clone of 6294 nucleotides and contains an open reading frame (ORF)
that begins with an ATG initiation codon at nucleotide 416 and ends with a TAA stop codon at
nucleotides 4259-4261. A representative ORF encodes a 1281 amino acid polypeptide (SEQ ID
NO: 2). The initiation and stop codons of SEQ ID NO: 1 are shown in bold font. Putative
untranslated regions are upstream of the initiation codon and downstream of the stop codon in
SEQ ID NO: 1.
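
The ORF coordinates above can be checked with a short calculation. The following minimal Python sketch assumes 1-based coordinates in which nucleotide 416 is the first base of the ATG codon and nucleotide 4259 is the first base of the TAA stop codon; the function names are illustrative and not part of the disclosure. It counts the codons between the two positions and translates the first and last few codons of the ORF with the standard genetic code.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def orf_codon_count(start_nt, stop_codon_start_nt):
    """Number of sense codons from the ATG up to, not including, the stop."""
    span = stop_codon_start_nt - start_nt
    if span % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return span // 3

def translate(dna):
    """Translate DNA in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# Excerpts of SEQ ID NO: 1 (nucleotides 416-439 and 4247-4261).
ORF_START_EXCERPT = "ATGCCCTATGACCACTTCCAACCT"
ORF_END_EXCERPT = "TGTCAAGAGGGATAA"

print(orf_codon_count(416, 4259))
print(translate(ORF_START_EXCERPT))
print(translate(ORF_END_EXCERPT))
```

Under these assumptions the span of 4259 - 416 = 3843 bases is divisible by three and corresponds to 1281 codons, in agreement with the 1281 amino acid polypeptide of SEQ ID NO: 2. The translated excerpts correspond to the first eight and last four residues of that polypeptide.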



TABLE 2

TAATAGAGACCTTTCAAAGGACAAATTCTGTGAAATAAAGTGGTTTTCTGAAGAGCCTAC
TAATAGGACAGTGTGTTAATATCACTAATAAGAGAGTAATGATTATAAAAAGGAATAAAT
TTATTGAAATTGCAAGATACTTTTCTCCTTTGATTAATATACTGCTAGTTTAGTTTTCTA
CATTTTCAAATAGAACTGGGGAATTTGTGTCGTAGATATTCTTGACAACTAAAGAGATGG
TGGCTGAATTTTTGGGAATGGTTGATAACACTTGATATTTTTAGTTTCCAATTTGGAAGA
GCTCTGTCTCTTGGGATGTCAAATATTATATTCGTCAATTAATGAATGTGTTAATTTATT
ATAGAAATGATATTCTCACAATGATTTCATTTGTAGTGATGGATTTAAAGAGATAATGCC
CTATGACCACTTCCAACCTCTTCCTCGCTGGGAACATAATCCTTGGACTGCATGTTCCGT
GTCCTGTGGAGGAGGGATTCAGAGACGGAGCTTTGTGTGTGTAGAGGAATCCATGCATGG
AGAGATATTGCAGGTGGAAGAATGGAAGTGCATGTACGCACCCAAACCCAAGGTTATGCA
AACTTGTAATCTGTTTGATTGCCCCAAGTGGATTGCCATGGAGTGGTCTCAGTGCACAGT
GACTTGTGGCCGAGGGTTACGGTACCGGGTTGTTCTGTGTATTAACCACCGCGGAGAGCA
TGTTGGGGGCTGCAATCCACAACTGAAGTTACACATCAAAGAAGAATGTGTCATTCCCAT
CCCGTGTTATAAACCAAAAGAAAAAAGTCCAGTGGAAGCAAAATTGCCTTGGCTGAAACA
AGCACAAGAACTAGAAGAGACCAGAATAGCAACAGAAGAACCAACGTTC[illegible]
CTGGTCAGCCTGCAGTACCACGTGTGGGCCGGGTGTGCAGGTCCGTGAGGTGAAGTGCCG
TGTGCTCCTCACATTCACGCAGACTGAGACTGAGCTGCCCGAGGAAGAGTGTGAAGGCCC
CAAGCTGCCCACCGAACGGCCCTGCCTCCTGGAAGCATGTGATGAGAGCCCGGCCTCCCG
AGAGCTAGACATCCCTCTCCCTGAGGACAGTGAGACGACTTACGACTGGGAGTACGCTGG
GTTCACCCCTTGCACAGCAACATGCGTGGGAGGCCATCAAGAAGCCATAGCAGTGTGCTT
ACATATCCAGACCCAGCAGACAGTCAATGACAGCTTGTGTGATATGGTCCACCGTCCTCC
AGCCATGAGCCAGGCCTGTAACACAGAGCCCTGTCCCCCCAGGTGGCATGTGGGCTCTTG
GGGGCCCTGCTCAGCTACCTGTGGAGTTGGAATTCAGACCCGAGATGTGTACTGCCTGCA
CCCAGGGGAGACCCCTGCCCCTCCTGAGGAGTGCCGAGATGAAAAGCCCCATGCTTTACA
AGCATGCAATCAGTTTGACTGCCCTCCTGGCTGGCACATTGAAGAATGGCAGCAGTGTTC
CAGGACTTGTGGCGGGGGAACTCAGAACAGAAGAGTCACCTGTCGGCAGCTGCTAACGGA
TGGCAGCTTTTTGAATCTCTCAGATGAATTGTGCCAAGGACCCAAGGCATCGTCTCACAA
GTCCTGTGCCAGGACAGACTGTCCTCCACATTTAGCTGTGGGAGACTGGTCGAAGTGTTC
TGTCAGTTGTGGTGTTGGAATCCAGAGAAGAAAGCAGGTGTGTCAAAGGCTGGCAGCCAA
AGGTCGGCGCATCCCCCTCAGTGAGATGATGTGCAGGGATCTACCAGGGCTCCCTCTTGT
AAGATCTTGCCAGATGCCTGAGTGCAGTAAAATCAAATCAGAGATGAAGACAAAACTTGG
TGAGCAGGGTCCGCAGATCCTCAGTGTCCAGAGAGTCTACATTCAGACAAGGGAAGAGAA
GCGTATTAACCTGACCATTGGTAGCAGAGCCTATTTGCTGCCCAACACATCCGTGATTAT
TAAGTGCCCAGTGCGACGATTCCAGAAATCTCTGATCCAGTGGGAGAAGGATGGCCGTTG
CCTGCAGAACTCCAAACGGCTTGGCATCACCAAGTCAGGCTCACTAAAAATCCACGGTCT
TGCTGCCCCCGACATCGGCGTGTACCGGTGCATTGCAGGCTCTGCACAGGAAACAGTTGT
GCTCAAGCTCATTGGTACTGACAACCGGCTCATCGCACGCCCAGCCCTCAGGGAGCCTAT
GAGGGAATATCCTGGGATGGACCACAGCGAAGCCAATAGTTTGGGAGTCACATGGCACAA
AATGAGGCAAATGTGGAATAACAAAAATGACCTTTATCTGGATGATGACCACATTAGTAA
CCAGCCTTTCTTGAGAGCTCTGTTAGGCCACTGCAGCAATTCTGCAGGAAGCACCAACTC
CTGGGAGTTGAAGAATAAGCAGTTTGAAGCAGCAGTTAAACAAGGAGCATATAGCATGGA
TACAGCCCAGTTTGATGAGCTGATAAGAAACATGAGTCAGCTCATGGAAACCGGAGAGGT
CAGCGATGATCTTGCGTCCCAGCTGATATATCAGCTGGTGGCCGAATTAGCCAAGGCACA
GCCAACACACATGCAGTGGCGGGGCATCCAGGAAGAGACACCTCCTGCTGCTCAGCTCAG
AGGGGAAACAGGGAGTGTGTCCCAAAGCTCGCATGCAAAAAACTCAGGCAAGCTGACATT
CAAGCCGAAAGGACCTGTTCTCATGAGGCAAAGCCAACCTCCCTCAATTTCATTTAATAA
AACAATAAATTCCAGGATTGGAAATACAGTATACATTACAAAAAGGACAGAGGTCATCAA
TATACTGTGTGACCTTATTACCCCCAGTGAGGCCACATATACATGGACCAAGGATGGAAC
CTTGTTACAGCCCTCAGTAAAAATAATTTTGGATGGAACTGGGAAGATACAGATACAGAA
TCCTACAAGGAAAGAACAAGGCATATATGAATGTTCTGTAGCTAATCATCTTGGTTCAGA
TGTGGAAAGTTCTTCTGTGCTGTATGCAGAGGCACCTGTCATCTTGTCTGTTGAAAGAAA
TATCACCAAACCAGAGCACAACCATCTGTCTGTTGTGGTTGGAGGCATCGTGGAGGCAGC
CCTTGGAGCAAACGTGACAATCCGATGTCCTGTAAAAGGTGTCCCTCAGCCTAATATAAC
TTGGTTGAAGAGAGGAGGATCTCTGAGTGGCAATGTTTCCTTGCTTTTCAATGGATCCCT
GTTGTTGCAGAATGTTTCCCTTGAAAATGAAGGAACCTACGTCTGCATAGCCACCAATGC
TCTTGGAAAGGCAGTGGCAACATCTGTACTCCACTTGCTGGAACGAAGATGGCCAGAGAG
TAGAATCGTATTTCTGCAAGGACATAAAAAGTACATTCTCCAGGCAACCAACACTAGAAC
CAACAGCAATGACCCAACAGGAGAACCCCCGCCTCAAGAGCCTTTTTGGGAGCCTGGTAA
CTGGTCACATTGTTCTGCCACCTGTGGTCATTTGGGAGCCCGCATTCAGAGACCCCAGTG
TGTGATGGCCAATGGGCAGGAAGTGAGTGAGGCCCTGTGTGATCACCTCCAGAAGCCACT
GGCTGGGTTTGAGCCCTGTAACATCCGGGACTGCCCAGCGAGGTGGTTCACAAGTGTGTG
GTCACAGTGCTCTGTGTCTTGCGGTGAAGGATACCACAGTCGGCAGGTGACGTGCAAGCG
GACAAAAGCCAATGGAACTGTGCAGGTGGTGTCTCCAAGAGCATGTGCCCCTAAAGACCG
GCCTCTGGGAAGAAAACCATGTTTTGGTCATCCATGTGTTCAGTGGGAACCAGGGAACCG
GTGTCCTGGACGTTGCATGGGCCGTGCTGTGAGGATGCAGCAGCGTCACACAGCTTGTCA
ACACAACAGCTCTGACTCCAACTGTGATGACAGAAAGAGACCCACCTTAAGAAGGAACTG
CACATCAGGGGCCTGTGATGTGTGTTGGCACACAGGCCCTTGGAAGCCCTGTACAGCAGC
CTGTGGCAGGGGTTTCCAGTCTCGGAAAGTCGACTGTATCCACACAAGGAGTTGCAAACC
TGTGGCCAAGAGACACTGTGTACAGAAAAAGAAACCAATTTCCTGGCGGCACTGTCTTGG
GCCCTCCTGTGATAGAGACTGCACAGACACAACTCACTACTGTATGTTTGTAAAACATCT
TAATTTGTGTTCTCTAGACCGCTACAAACAAAGGTGCTGCCAGTCATGTCAAGAGGGATA
AACCTTTGGAGGGGTCATGATGCTGCTGTGAAGATAAAAGTAGAATATAAAAGCTCTTTT
CCCCATGTCGCTGATTCAAAAACATGTATTTCTTAAAAGACTAGATTCTATGGATCAAAC
AGAGGTTGATGCAAAAACACCACTGTTAAGGTGTAAAGTGAAATTTTCCAATGGTAGTTT
TATATTCCAATTTTTTAAAATGATGTATTCAAGGATGAACAAAATACTATAGCATGCATG
CCACTGCACTTGGGACCTCATCATGTCAGTTGAATCGAGAAATCACCAAGATTATGAGTG
CATCCTCACGTGCTGCCTCTTTCCTGTGATATGTAGACTAGCACAGAGTGGTACATCCTA
AAAACTTGGGAAACACAGCAACCCATGACTTCCTCTTCTCTCAAGTTGCAGGTTTTCAAC
AGTTTTATAAGGTATTTGCATTTTAGAAGCTCTGGCCAGTAGTTGTTAAGATGTTGGCAT
TAATGGCATTTTCATAGATCCTTGGTTTAGTCTGTGAAAAAGAAACCATCTCTCTGGATA
GGCTGTCACACTGACTGACCTAAGGGTTCATGGAAGCATGGCATCTTGTCCTTGCTTTTA
GAACACCCATGGAAGAAAACACAGAGTAGATATTGCTGTCATTTATACAACTACAGAAAT
TTATCTATGACCTAATGAGGCATCTCGGAAGTCAAAGAAGAGGGAAAGTTAACCTTTTCT
ACTGATTTCGTAGTATATTCAGAGCTTTCTTTTAAGAGCTGTGAATGAAACTTTTTCTAA
GCACTATTCTATTGCACACAAACAGAAAACCAAAGCCTTATTAGACCTAATTTATGCATA
AAGTAGTATTCCTGAGAACTTTATTTTGGAAAATTTATAAGAAAGTAATCCAAATAAGAA
ACACGATAGTTGAAAATAATTTTTATAGTAAATAATTGTTTTGGGCTGATTTTTCAGTAA
ATCCTU^GTGACTTAGGTTAGAAGTTACACTAAGGACCAGGGGTTGGAATCAGAATTTAG
TTTAAGATTTGAGGAAAAGGGTAAGGGTTAGTTTCAGTTTTAGGATTAGAGCTAGAATTG
GGTTAGGTGAGAAAGAAAGTTAAGGTTAAGGCTAGAGTTGTCTTTAAGGGTTAGGGTTAG
GACCAGGTTAGGTCAGGGTTGGATTGGGTTTAGATTGGGGCCAGTGCTGGTGTTAGTGAT
AGTGTCAGGATGGAGGTTAGGTTTGGAGTAAGCGTTGTTGCTGAAGTGAGTTCAGGCTAG
CATTAAATTGTAAGTTCTGAAGCTGATTTGGTTATGGGGTCTTTCCCCTGTATACTACCA
GTTGTGTCTTTAGATGGCACACAAGTCCAAATAAGTGGTCATACTTCTTTATTCAGGGTC
TCAGCTGCCTGTACACCTGCTGCCTACATCTTCTTGGCAACAAAGTTACCTGCCACAGGC
TCTGCTGAGCCTAGTTCCTGGTCAGTAATAACTGAACAGTGCATTTTGGCTTTGGATGTG
TCTGTGGACAAGCTTGCTGAGTTTCTCTACCATATTCTGAGCACACGGTCTCTTTTGTTC
TAATTTCAGCTTCACTGACACTGGGTTGAGCACTACTGTATGTGGAGGGTTTGGTGATTG
GGAATGGATGGGGGACAGTGAGGAGGACACACCAGCCCATTAGTTGTTAATCATCAATCA
CATCTGATTGTTGAAGGTTATTAAATTAAAAGATIAGATCATTTGTAACATACTCTTTGTA
TATATTTATTATATGAAAGGTGCAATATTTTATTTTGTACAGTATGTAATAAAGACATGG
GACATATATTTTTCTTATTAACAAAATTTCATATTAAATTGCTTCACTTTGTATTTAAAG
TTAAAAGTTACTATTTTTCATTTGCTATTGTACTTTCATTGTTGTCATTCAATTGACATT
CCTGTGTACTGTATTTTACTACTGTTTTTATAACATGAGAGTTAATGTTTCTGTTTCATG
ATCCTTATGTAATTCAGAAATAAATTTACTTTGATTATTCAGTGGCATCCTTAT (SEQ ID NO: 1)



mpydhfqplprwehnpwtacsvscgggiqrrsfvcveesmhgeilqveewkcmyapkpkvmqtcnlfdcpkwiame
wsqctvtcgrglryrvvlcinhrgehvggcnpqlklhikeecvipipcykpkekspveaklpwlkqaqeleetria
teeptfipepwsacsttcgpgvqvrevkcrvlltftqtetelpeeecegpklpterpclleacdespasreldipl
pedsettydweyagftpctatcvgghqeaiavclhiqtqqtvndslcdmvhrppamsqacntepcpprwhvgswgp
csatcgvgiqtrdvyclhpgetpappeecrdekphalqacnqfdcppgwhieewqqcsrtcgggtqnrrvtcrqll
tdgsflnlsdelcqgpkasshkscartdcpphlavgdwskcsvscgvgiqrrkqvcqrlaakgrriplsemmcrdl
pglplvrscqmpecskiksemktklgeqgpqilsvqrvyiqtreekrinltigsrayllpntsviikcpvrrfqks
liqwekdgrclqnskrlgitksgslkihglaapdigvyrciagsaqetvvlkligtdnrliarpalrepmreypgm
dhseanslgvtwhkmrqmwnnkndlyldddhisnqpflrallghcsnsagstnswelknkqfeaavkqgaysmdta
qfdelirnmsqlmetgevsddlasqliyqlvaelakaqpthmqwrgiqeetppaaqlrgetgsvsqsshaknsgkl
tfkpkgpvlmrqsqppsisfnktinsrigntvyitkrtevinilcdlitpseatytwtkdgtllqpsvkiildgtg
kiqiqnptrkeqgiyecsvanhlgsdvesssvlyaeapvilsvernitkpehnhlsvvvggiveaalganvtircp
vkgvpqpnitwlkrggslsgnvsllfngslllqnvslenegtyvciatnalgkavatsvlhllerrwpesrivflq
ghkkyilqatntrtnsndptgepppqepfwepgnwshcsatcghlgariqrpqcvmangqevsealcdhlqkplag
fepcnirdcparwftsvwsqcsvscgegyhsrqvtckrtkangtvqvvspracapkdrplgrkpcfghpcvqwepg
nrcpgrcmgravrmqqrhtacqhnssdsncddrkrptlrrnctsgacdvcwhtgpwkpctaacgrgfqsrkvdcih
trsckpvakrhcvqkkkpiswrhclgpscdrdctdtthycmfvkhlnlcsldrykqrccqscqeg (SEQ ID NO: 2)



In a search of sequence databases, it was found, for example, that the disclosed NOV-1
nucleotide sequence has 5106 of 5107 bases (99%) identical to a human mRNA for a
KIAA1233 protein (SECR) (GenBank Accession No: AB033059), as shown in Table 3. In all
sequence alignments, identical residues are depicted as "|". As indicated by the "Expect"
value, the probability of this alignment occurring by chance alone is 0.0, the lowest
probability.
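
The identity statistics and the "|" match line used in the alignments can be reproduced with a short Python sketch. It assumes an ungapped alignment of two equal-length strings and an integer percentage rounded down, which yields the 99% reported for 5106 of 5107 bases; the rounding rule of the original search tool is an assumption, and the function names are illustrative. The two example segments are the aligned NOV1 (1968-2027) and SECR (781-840) rows of Table 3, which contain the single mismatch.

```python
def match_line(query, subject):
    """Return a string with '|' at identical positions and ' ' elsewhere."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    return "".join("|" if q == s else " " for q, s in zip(query, subject))

def identity_summary(query, subject):
    """Return (identities, aligned length, integer percent identity)."""
    line = match_line(query, subject)
    identities = line.count("|")
    return identities, len(line), identities * 100 // len(line)

nov1 = "catccgtgattattaagtgcccagtgcgacgattccagaaatctctgatccagtgggaga"
secr = "catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga"

print(nov1)
print(match_line(nov1, secr))
print(secr)
print(identity_summary(nov1, secr))
print(5106 * 100 // 5107)
```

For this segment the sketch finds 59 identities in 60 aligned positions, with the gap in the match line marking the mismatched base.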

Furthermore, the encoded amino acid sequence has 1023 of 1023 amino acid residues
(100%) identical to, and 1023 of 1023 residues (100%) positive with, a 1023 amino acid
residue human KIAA1233 protein (GenBank Accession No: BAA86547), as shown in Table
4. As indicated by the "Expect" value, the probability of this alignment occurring by chance
alone is 0, the lowest probability.



TABLE 3

Score = 1.012e+04 bits (5103), Expect = 0.0
Identities = 5106/5107 (99%)
Strand = Plus / Plus

NOV1: 1188 tagcagtgtgcttacatatccagacccagcagacagtcaatgacagcttgtgtgatatgg 1247
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:    1 tagcagtgtgcttacatatccagacccagcagacagtcaatgacagcttgtgtgatatgg 60

NOV1: 1248 tccaccgtcctccagccatgagccaggcctgtaacacagagccctgtccccccaggtggc 1307
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:   61 tccaccgtcctccagccatgagccaggcctgtaacacagagccctgtccccccaggtggc 120

NOV1: 1308 atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 1367
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  121 atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 180

NOV1: 1368 tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 1427
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  181 tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 240

NOV1: 1428 cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 1487
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  241 cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 300

NOV1: 1488 ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 1547
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  301 ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 360

NOV1: 1548 agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 1607
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  361 agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 420

NOV1: 1608 catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 1667
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  421 catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 480

NOV1: 1668 ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 1727
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  481 ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 540

NOV1: 1728 ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 1787
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  541 ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 600

NOV1: 1788 ggctccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 1847
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  601 ggctccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 660

NOV1: 1848 agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 1907
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  661 agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 720

NOV1: 1908 caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 1967
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  721 caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 780

NOV1: 1968 catccgtgattattaagtgcccagtgcgacgattccagaaatctctgatccagtgggaga 2027
           |||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
SECR:  781 catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga 840

NOV1: 2028 aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 2087
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  841 aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 900

NOV1: 2088 aaatccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 2147
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  901 aaatccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 960

NOV1: 2148 aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 2207
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  961 aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 1020

NOV1: 2208 tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 2267
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1021 tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 1080

NOV1: 2268 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 2327
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1081 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 1140

NOV1: 2328 accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 2387
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1141 accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 1200

NOV1: 2388 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 2447
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1201 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 1260

NOV1: 2448 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 2507
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1261 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 1320

NOV1: 2508 aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 2567
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1321 aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 1380

NOV1: 2568 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 2627
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1381 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 1440

NOV1: 2628 ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 2687
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1441 ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 1500

NOV1: 2688 gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 2747
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1501 gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 1560

NOV1: 2748 tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 2807
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1561 tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 1620

NOV1: 2808 cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 2867
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1621 cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 1680

NOV1: 2868 ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 2927
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1681 ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 1740

NOV1: 2928 tacagatacagaatcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatc 2987
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1741 tacagatacagaatcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatc 1800

NOV1: 2988 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 3047
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1801 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 1860

NOV1: 3048 ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 3107
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1861 ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 1920

NOV1: 3108 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 3167
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1921 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 1980

NOV1: 3168 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 3227
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1981 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 2040

NOV1: 3228 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 3287
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2041 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 2100

NOV1: 3288 tagccaccaatgctcttggaaaggcagtggcaacatctgtactccacttgctggaacgaa 3347
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2101 tagccaccaatgctcttggaaaggcagtggcaacatctgtactccacttgctggaacgaa 2160

NOV1: 3348 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 3407
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2161 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 2220

NOV1: 3408 ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 3467
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2221 ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 2280

NOV1: 3468 gggagcctggtaactggtcacattgttctgccacctgtggtcatttgggagcccgcattc 3527
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2281 gggagcctggtaactggtcacattgttctgccacctgtggtcatttgggagcccgcattc 2340

NOV1: 3528 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 3587
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2341 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 2400

NOV1: 3588 tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 3647
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2401 tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 2460

NOV1: 3648 tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 3707
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2461 tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 2520

NOV1: 3708 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 3767
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2521 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 2580

NOV1: 3768 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 3827
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2581 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 2640

NOV1: 3828 aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 3887
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2641 aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 2700

NOV1: 3888 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 3947
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2701 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 2760

NOV1: 3948 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 4007
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2761 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 2820

NOV1: 4008 cctgtacagcagcctgtggcaggggtttccagtctcggaaagtcgactgtatccacacaa 4067
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2821 cctgtacagcagcctgtggcaggggtttccagtctcggaaagtcgactgtatccacacaa 2880

NOV1: 4068 ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 4127
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2881 ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 2940

NOV1: 4128 ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 4187
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2941 ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 3000

NOV1: 4188 ttgtaaaacatcttaatttgtgttctctagaccgctacaaacaaaggtgctgccagtcat 4247
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3001 ttgtaaaacatcttaatttgtgttctctagaccgctacaaacaaaggtgctgccagtcat 3060

NOV1: 4248 gtcaagagggataaacctttggaggggtcatgatgctgctgtgaagataaaagtagaata 4307
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3061 gtcaagagggataaacctttggaggggtcatgatgctgctgtgaagataaaagtagaata 3120

NOV1: 4308 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 4367
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3121 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 3180

NOV1: 4368 ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 4427
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3181 ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 3240

NOV1: 4428 ccaatggtagttttatattccaattttttaaaatgatgtattcaaggatgaacaaaatac 4487
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3241 ccaatggtagttttatattccaattttttaaaatgatgtattcaaggatgaacaaaatac 3300

NOV1: 4488 tatagcatgcatgccactgcacttgggacctcatcatgtcagttgaatcgagaaatcacc 4547
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3301 tatagcatgcatgccactgcacttgggacctcatcatgtcagttgaatcgagaaatcacc 3360

NOV1: 4548 aagattatgagtgcatcctcacgtgctgcctctttcctgtgatatgtagactagcacaga 4607
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3361 aagattatgagtgcatcctcacgtgctgcctctttcctgtgatatgtagactagcacaga 3420

NOV1: 4608 gtggtacatcctaaaaacttgggaaacacagcaacccatgacttcctcttctctcaagtt 4667
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3421 gtggtacatcctaaaaacttgggaaacacagcaacccatgacttcctcttctctcaagtt 3480

NOV1: 4668 gcaggttttcaacagttttataaggtatttgcattttagaagctctggccagtagttgtt 4727
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3481 gcaggttttcaacagttttataaggtatttgcattttagaagctctggccagtagttgtt 3540

NOV1: 4728 aagatgttggcattaatggcattttcatagatccttggtttagtctgtgaaaaagaaacc 4787
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3541 aagatgttggcattaatggcattttcatagatccttggtttagtctgtgaaaaagaaacc 3600

NOV1: 4788 atctctctggataggctgtcacactgactgacctaagggttcatggaagcatggcatctt 4847
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3601 atctctctggataggctgtcacactgactgacctaagggttcatggaagcatggcatctt 3660

NOV1: 4848 gtccttgcttttagaacacccatggaagaaaacacagagtagatattgctgtcatttata 4907
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3661 gtccttgcttttagaacacccatggaagaaaacacagagtagatattgctgtcatttata 3720



NOVl: 
4967 

SECR 
3780 



4908 caactacagaaatttatctatgacctaatgaggcatctcggaagtcaaagaagagggaaa 



3721 



IIIMIIIIIIIIIIIIIIIIIIMIMIIIIIIIIIMtlllllllllllllllMIII 
caactacagaaatttatctatgacctaatgaggcatctcggaagtcaaagaagagggaaa 
NOV1: 4968 gttaaccttttctactgatttcgtagtatattcagagctttcttttaagagctgtgaatg 5027
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3781 gttaaccttttctactgatttcgtagtatattcagagctttcttttaagagctgtgaatg 3840

NOV1: 5028 aaactttttctaagcactattctattgcacacaaacagaaaaccaaagccttattagacc 5087
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3841 aaactttttctaagcactattctattgcacacaaacagaaaaccaaagccttattagacc 3900

NOV1: 5088 taatttatgcataaagtagtattcctgagaactttattttggaaaatttataagaaagta 5147
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3901 taatttatgcataaagtagtattcctgagaactttattttggaaaatttataagaaagta 3960

NOV1: 5148 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 5207
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3961 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 4020

NOV1: 5208 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 5267
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4021 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 4080

NOV1: 5268 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 5327
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4081 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 4140

NOV1: 5328 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 5387
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4141 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 4200
NOV1: 5388 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 5447
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4201 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 4260

NOV1: 5448 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 5507
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4261 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 4320
NOV1: 5508 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 5567
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4321 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 4380

NOV1: 5568 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 5627
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4381 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 4440

NOV1: 5628 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 5687
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4441 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 4500

NOV1: 5688 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 5747
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4501 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 4560

NOV1: 5748 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 5807
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4561 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 4620

NOV1: 5808 gtctcttttgttctaatttcagcttcactgacactgggttgagcactactgtatgtggag 5867
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4621 gtctcttttgttctaatttcagcttcactgacactgggttgagcactactgtatgtggag 4680

NOV1: 5868 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 5927
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4681 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 4740

NOV1: 5928 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 5987
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4741 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 4800

NOV1: 5988 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 6047
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4801 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 4860



WO 01/62928 PCT/US01/06151



NOV1: 6048 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 6107
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4861 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 4920

NOV1: 6108 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 6167
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4921 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 4980

NOV1: 6168 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 6227
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 4981 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 5040

NOV1: 6228 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 6287
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 5041 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 5100

NOV1: 6288 tccttat 6294 (SEQ ID NO: 58)
           |||||||
SECR: 5101 tccttat 5107 (SEQ ID NO: 24)



Table 4

Score = 2027 bits (5253), Expect = 0.0
Identities = 1023/1023 (100%), Positives = 1023/1023 (100%)

NOV1: 259  AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 318
           AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV
SECR: 1    AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 60

NOV1: 319  YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 378
           YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ
SECR: 61   YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 120

NOV1: 379  LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 438
           LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR
SECR: 121  LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 180

NOV1: 439  LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 498
           LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT
SECR: 181  LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 240

NOV1: 499  REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 558
           REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK
SECR: 241  REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 300

NOV1: 559  IHGIAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 618
           IHGIAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV
SECR: 301  IHGIAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 360

NOV1: 619  TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 678
           TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA
SECR: 361  TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 420

NOV1: 679  YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 738
           YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA
SECR: 421  YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 480

NOV1: 739  AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 798
           AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT
SECR: 481  AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 540

NOV1: 799  EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 858
           EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH
SECR: 541  EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 600

NOV1: 859  LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 918
           LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ
SECR: 601  LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 660

NOV1: 919  PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR 978
           PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR
SECR: 661  PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR 720

NOV1: 979  WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 1038
           WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ
SECR: 721  WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 780

NOV1: 1039 RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 1098
           RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV
SECR: 781  RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 840

NOV1: 1099 TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 1158
           TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH
SECR: 841  TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 900

NOV1: 1159 TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 1218
           TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR
SECR: 901  TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 960

NOV1: 1219 SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1278
           SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC
SECR: 961  SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1020

NOV1: 1279 QEG 1281 (SEQ ID NO: 59)
           QEG
SECR: 1021 QEG 1023 (SEQ ID NO: 25)



Based on the relatedness of NOV-1 to KIAA1233 sequences, which are related to lacunin,
thrombospondins, proteinases, semaphorins, ADAM-TS and properdin family members,
nucleic acids and proteins according to the invention likely have functions similar to those of
proteins belonging to these families. Thus, the NOV-1 of the invention is implicated in the
following diseases and processes and has therapeutic uses in these diseases and processes: (i)
inflammation, (ii) cancer, (iii) neuronal development and axonal guidance, (iv) angiogenesis
and vasculogenesis, in cancer as well as for ischemia, (v) tissue regeneration in vivo and
in vitro, and (vi) other diseases and disorders.
Functional roles attributed to this family of proteins include cell attachment, spreading,
motility, and proliferation, cytoskeletal organization, wound healing, and angiogenesis.
Moreover, these proteins are expressed in the nervous system during development and are
thought to play roles in neuronal growth and patterning. In particular, the thrombospondin,
METH-1 and ADAMTS families of proteins are potent inhibitors of angiogenesis. The
ADAMTS proteins have also been implicated in cleavage of proteoglycans and the control of
organ shape during development. In addition, the thrombospondins have been implicated in
the activation of both transforming growth factor-beta (TGF-β) precursors and TGF-β in a
variety of disease states. Furthermore, semaphorin proteins have shown expression in
undifferentiated neuroepithelium, suggesting that these proteins are actors in axonal guidance.

NOV 2: A Novel KIAA1233-like Protein 

The NOV-2 sequences according to the invention include nucleotide sequences
encoding a polypeptide related to KIAA1233 proteins, which bear sequence similarity to
lacunin, thrombospondins, proteinases, semaphorins, ADAM-TS, and properdin family
members.

NOV2a and NOV2b are splice variants. Splice variants are sequences that occur
naturally within the cells and tissues of individuals. The physiological activity of splice variant
products and of the original protein from which they are varied may be the same (although
perhaps at a different level), opposite, or completely different and unrelated. In addition,
variants may have no activity at all. When a variant and the original sequence have the same
or opposite activity, they may differ in various properties not directly connected to biological
activity, such as stability, clearance rate, tissue and cellular localization, temporal pattern of
expression, up or down regulation mechanisms, and responses to agonists or antagonists. The
presence or level of specific splice variants may be the cause of, and/or indicative of, a disease,
disorder, pathological or normal condition.

Because a drug may be effective against one variant but not another, or may cause side
effects because it targets all splice variants, an effective drug needs to target the particular
splice variant. Because soluble variants with therapeutic or disease-related functions may be
naturally occurring in specific tissues, they may be optimal candidates for drug targets or
protein therapeutics. Variants may have no activity at all and may thus serve as dominant
negative natural inhibitors. Thus, splice variants are useful in generating new drug targets,
protein therapeutics and markers for diagnostics.

NOV-2 maps to Unigene cluster Hs.18705. This cluster has been mapped to
Chromosome 15 Marker stSG35204, Interval D15S115-D15S152. By integrating information
from the Online Mendelian Inheritance in Man (OMIM), this region is identified as 15q22-qter.
Therefore, the chromosomal location of the invention is Chromosome 15 Marker
stSG35204, Interval D15S115-D15S152, which corresponds to 15q22-qter.

NOV-2a

A NOV-2a nucleic acid of the invention, encoding a KIAA1233-like protein
originating from chromosome 15, is shown in TABLE 5. The disclosed nucleic acid (SEQ ID
NO: 3) is 7260 nucleotides and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 136 and ends with a TAA stop codon at nucleotide 5209.
The representative ORF encodes a 1691 amino acid polypeptide (SEQ ID NO: 4). The
initiation and stop codons of SEQ ID NO: 3 are shown in bold font. The protein has a
predicted molecular weight of 188743.8 daltons. Putative untranslated regions are upstream of
the initiation codon and downstream of the stop codon in SEQ ID NO: 3.
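
The ORF coordinates stated above can be checked for internal consistency with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed method; the function name `orf_summary` is hypothetical, and it assumes 1-based coordinates in which 136 is the first base of the ATG codon and 5209 is the first base of the TAA stop codon.

```python
def orf_summary(atg_start, stop_start, total_length):
    """Summarize an ORF from 1-based coordinates.

    Assumptions (not stated explicitly in the text): atg_start is the first
    base of the initiation codon and stop_start is the first base of the
    stop codon. The stop codon itself is not translated.
    """
    coding_nt = stop_start - atg_start
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("coordinates do not describe an in-frame ORF")
    stop_end = stop_start + 2  # a codon spans three bases
    if stop_end > total_length:
        raise ValueError("stop codon extends past the end of the sequence")
    return {
        "amino_acids": coding_nt // 3,           # includes the initiator Met
        "stop_codon_end": stop_end,
        "utr5_length": atg_start - 1,            # bases upstream of the ATG
        "utr3_length": total_length - stop_end,  # bases downstream of the stop
    }


# Values quoted for SEQ ID NO: 3 (7260 nt, ATG at 136, TAA at 5209).
print(orf_summary(136, 5209, 7260))
```

Under these assumptions the coding region is 5073 nucleotides, which corresponds to the 1691 amino acids stated for SEQ ID NO: 4, with putative untranslated regions of 135 and 2049 nucleotides.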

TABLE 5

CXX3VCX3M6TGTTGACX3GGCX3GCrT€TO 

CCMaTACTTCTGCGGCGCMGGCTAC^ 

GGGTGCTGATAGGGA^TGGTCrrCAaXXa^CTCTCCC^ 

CraSAGTTTGCACOTTCTCCTCAGGGAAGTT^ 
25 CCMACCTCAACSAAACACTTOTTCAGATGAAGAC^^ 

GQACXITCTGGGGGAGGAGCATCArATTCTCTGCGGA^ 

AAGACATGCAGCAATCATGACTGCCCTCCAGATGCAGAA!^ 

GTATCMGGGCATTACTATGAATGGCTTCCACGATATAATGA^ 

GACAAAACTTGGTGGTGGAfiCTGGCACCTATijGGTACT 
30 AGTGGCATCTGT(3iGGCAGTGGGCrGCGATCGGCAACT<^^ 

CGATGGCTCCACCTGCAGGCriOTACGGGGACAATCAAAGTa 

CTGTTCCTTTGGGAAGTOSAAGTGTGAGAATTACAGTGAAA^^ 

GGAAGCy^AAGGAGAACAOiGCTTTAACAGCCCCXSKSa^ 

CGAGAGGCAAACTTTTAAGATTCCACSQACCTCTG^ 
35 GOnXKSTTCaGTTCTTCOTTTACCAGCCC^ 

GGAGGAGGTTATCa^GCrCAATTCTGCTGAATGTQTGGATATCC^^ 

CTACCXTCAAAATGTAAAACCAAAACCAAAACTGAAtK^ 

AGATAATGOICTATGACCACTTCaUVCCTCrTCCT 
GGRGGGATTCAGI^CGGRGCrrrGTGTGTG^

CATGTACGCACaaUUVCCCAAGGTTATGCAAAC^^ 

ASTGCACAGTGACTTGTGGCOaAGGGOTAC^ 

TGCAAT(XACa^CTGAAGTTACACATCAAAGAAGAAT^^ 
5 AGTGGAACKAAAATTGCCOTKMCTC3A^ 

TTCCAGAACCXrrcGTCAGCCKKaGTACC^ 

ACATTCACGCAGACTGAQACTGAGCTGCCC^ 

GGAAGCATaTGAraAGAGCCCGGCCTCCCQAGASCTAGACAl^^ 

AGTACGCTGGGTTO^CCCCTTGCAaVGCAACATGCGl^^ 
10 ATCOiGaUSACAGTCAATGACA 

CnxSTCCCCCCaWSGTGGCa^TGTGGGCTOTK^^ 

ACTGCCTGCACCCACSGGGAaACCCCTGCCan'CX^^ 

CSWSTTTGACTGCCrTCCTGGCa^^ 

AAGAGTCaCCrroTCXSGCAGCTGCTAACGGATGGC^^ 
1 5 amn^CAOU^GTCCTOTGCCAGKSACMACra 

GGTGTTGGAATCaW3AGAAGAAAaCAC3GTGTGTCAAAGGCr^^ 

GTGCMGGATCrACCAGGGTTCairrCTTGTAA^^ 

CAAAACTOXSGTGAGCAGGGTCCGCa^TCCTCAGT^ 

CTGACCATTGGTAGCAGAGCCTATTTGCTGCCCAACACATC 
20 TCT^TCCAGTGGGAGAAGGATGGCTOTTGCCTGCACS^ 

TCCACXSGTCTTGCTGCCCCaSAaTCGGOjrGT^ 

ATTGGTACTOAa^OKKnraiTCGCACGa^^ 

AGCCAATAGTTTGGGAGTCACATGGCACAAAATGACSGC^ 

AmTTAGTAACCavSCCTTTCmaAGAGCTCTGT^ 
25 AAGAATAAGCACmTOAAGCAGCAGTTAAACAAGGAGa^TATAC^ 

mTGAGTCaVGCTCATGGAAACCGGAGAGGTCAGC^ 

CCAaGGCACAGCCaiACACAa\TGCAGtGGaK3GGCATC^^ 

GGGAGTGTGTaxaiAAGCTCGCATGCAAAAAACTCAGG(^^ 

AAGCCAACCTCrCTCAATTTCATTTAATAAAACAATAAATTCC^^ 
30 AGGTOITCAATATACTGTGTGACCTTATTACCCCCAGTGAGGCCACAT^^ 

CCCTCAGTAAAAATAATTTTGGATGGAACTGGGAAGATACAGATAC^^ 

ATGTTCT6TAGCTAATCa.TCrTGGTTCAGATGTGGAAAGOT 

TTGAT^GAAATATCACCAAACCAGACKJACAACCATCTO 

AAa3TGACAATCCX3ATGTCCTGTAAAAGGTGTCCCTC^^ 
35 CAATGOTTCCOTGCTTTTCAATGGATCCCTGTTGT^ 

CCACCAATGCTCnTOGAAAGGCAGTGGCAACATC^ 

TTTCTGCAAGGACTlTAAAAAGTACATTCTCaVGGCAACC^ 

GCCTCAAQAGCCTTOTTGGGAGCCTGGTAACroCT 

GACrcCMTGTGTGATGGCO^TGGGCAGGRAGTQAGTQRG^ 
40 QAGCCCnBTAACATCCGGGACTGCCCaWSCXiaGGTGQTTm 

ATACCACAGTCX3GCAGGTGACX3TGCAAGCGGACAAAM 

CTAAAQACOGGCCTCTGGGAAGW^CCATGTTTTGGTm^ 

CGTTGCATGGGCOSTGCTOTGAGGATGCAGCAGCGTCA^ 

CMAAAGAGACCCaVCCTTAAGAAGGAACTGCACATC^^ 
GTACAGCaGCCTCTGGCaGGGGTTTCCAGTCT^

i^CACTOroTACAGAAAAAGAAACaATTTCC^^ 

AACTCACTACTGTATGTTTGTAAAACATCTTAATTTGTGra 

AAGAGGQATA&ACXITTTGGAGGGGTCATGATGCTGC^^ 
5 CTGATTCAAAAACATGTATTTCTTAAAAQACTAGATTC^^ 

GTOTAAAGTGAAATTTTCaVATGGTAGTTTTATATTCOVA 

ACSCATGCATGCCACTGCACTTGGGACOTCAI^ 

TGCTGCCTCTTTCCTGTGATATGTAGACTAGCTU:^ 

TCCTCTTCTCTOVAGTTGCAGGTTTTCyVACAGTO 
10 ATGTTGGCATTAATGGCATTTTCATAGATCCTTGGTTTAGTC^ 

CTGACTGACCTAAGGGTTCATGGAAGGATGGCATCmOT 

TATTGCTGTCATTTATACAACTACAGAAATTTATCTATGACCrA^ 

AACCmTTCTACTGATTTCGTAGTATATTCAGAGCTTTC^^ 

ATTGCACACAAACAGAAAACCAAAGCCTTATTAGACCra 
1 5 AAATTTATAAGAAAGTAATCCMATAAGa^CAaSATAGTTGAAAATAAT^^ 

TTTTCaGTAAATCOWWWSTGACTTAGGTTAGAAGra 

GAGGAAAAGGGTAAGGGTTAGIOTCAGTTTTAGGATTAGAGCTA^ 

GCrAGAGTTOTCTTTAAGGGTTAGGGTTAGGACCAGGTTAGCT 

TOTTAGTGATAGTQTCaGGATGGAGGTTAGGTTTGGAG^ 
20 TAAGTTCTGATVGCTGATTTGGTTATGGGGTCTTTCCCCT^ 

ATAAgTGGTCATA CXTClXl ATTCAGGGTCTCT^GCTC^^ 

TGOaVCAGGCTCKSCIXSAGCCT 

AGCTTGCTGAGTTTCTCTACCATATT^^ 

CTVCrrACTGTATGTGGAGGGTTTGGTGATTGC^^ 
25 TCATCAATCA<3VTCTGATTGTTGAAGGTTATTAAATTAAAAGA 

ATATGfiAAGGTGCAATATTTTATTTTGTACAGTATCTAATAAAGACA 

ATATTAAATTGCTTCACam^TATTTAAAGTTAAAAGTTAC^^ 

AATTGACATTCCTGTGTACTGTATTTTACTACTGra 

AATTOiGAAATATATTTACTTTGaTTATT^^ (SEQ ID NO: 3) 




MASWTSPWWVLIGMVFMHSPLPQTTAEKSPGAYFLPEFALSPQGSFLEDTTGEQFLTYRYDDQTSRNTRSDEDKDG 
NWDAWGDWSDCSRTCGGGASYSLRRCLTGRNCEGQNIRYKTCSNHDCPPDAEDFRAQQCSAYNDVQYQGHYYEWLP 
RYNDPAAPCALKCHAQGQNLVVEIiAPK\n^DGTRCNTDSLDMCISGICQAVGCDRQLGSNAKEDN 

35 LVRGQSKSHVSPEiCREENVIAVPLGSRSVRITVKGPAHLFIESKTLQGSKGEHSFNSPGVFVVENTrVEF 

QTFKIPGPLMADFIFBCTRYTAT^DSWQFFFYQPISHQWRQTDFFPCTVTCGGGYQLNSAECVDIRLECRVVPDHYC 
HYYPEm^PKPKLKECSMDPCPSSDGFBOEIMPYDHFQPLPRWEHNPWTACSVSCGGGIQRRSFVC^^ 
VEEWKCMYAPKPKVMQTCNLFDCPKWIAMEWSQCTVTCGRGLRYRVVLCINHRGEHVGGCNPQLKLHIKE^ 
PCYKPKEKSPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQTETELPEEE 

40 CEGPKLPTERPCIiLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATCVGGHQEAIAVCLHIQTQQTVNDSIj 
CDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCP 
PGWHIEEWQQCSRTCGGGTQNRRVTCRQLLTDGSFLNLSDELCQGPECASSHKSCARTIX:PPHIiAVGDWSKCSVSCG 
vgiqrrkqvcqriaakgrriplsemmcrdlpgfplvrscqmpecskiksemktklgeqgpqilsvqrvyiq
rinltigsrayij:.pntsviikcpvrrpqksliqwekdgrclqnskrlgitksgslkihgij^ 
etvvlkligtdnrliarpalrepmreypgmdhseanslgvtwhkmrqmwnnkndlyldddhis 
nsagstnswelknkqfeaavkqgaysmdtaqfdelirnmsqlmetgevsddlasqliyqlvi^ 
5 iqeetppt^qlrgetgsvsqsshtycnsgkltfkpkgpvlmrqsqppsisfnktinsrigno^itkr^ 
litpseatytwtkdgtllqpsvkiildgtgeaqiqnptrkeqgiyecsvanhlgsdvesssvlyae^ 
itkpehnhlsvwggiveaalgaim:ircpvkgvpqpnitwijcrggslsgwsllfngs 
atnalgkavatsvfhllerrwpesrivflqghkkyilqatntrtnsndptgepppqepfwepgnwshcsatcghlg 
ariqrpqcvmangqevsealcdhlqkpiiagfepcnirdcparwftswsqcsvscgegyhsrqwckrtkangw 

10 WSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRHTACQHNSSDSNCDDRKRPTLRRNC^ 

CDVCWHTGPWKPCTAACGRGFQSRKVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFV^ 
NLCSLDRYKQRCCQSCQEG (SEQ ID NO: 4) 



In a search of sequence databases, it was found, for example, that the disclosed NOV-2a
nucleotide sequence has 5104 of 5107 bases (99%) identical to a human mRNA for a
KIAA1233 protein (GenBank Accession No: AB033059), as shown in Table 6. In all
sequence alignments, identical residues are depicted as "|". As indicated by the "Expect"
value, the probability of this alignment occurring by chance alone is 0.0, the lowest
probability.
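
The identity figure and the "|" match line described above can be reproduced for any ungapped pair of aligned rows. The Python sketch below is illustrative only; the helper names `match_line` and `identity` are hypothetical, gaps are not handled, and the percentage is truncated to a whole number, which is an assumption that happens to be consistent with 5104/5107 being reported as 99%.

```python
def match_line(query, subject):
    """Return a BLAST-style midline: '|' where residues agree, ' ' otherwise.

    Assumes an ungapped alignment, i.e. both rows have the same length.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join("|" if q == s else " " for q, s in zip(query, subject))


def identity(identical, total):
    """Percent identity truncated to a whole number (assumed convention)."""
    return identical * 100 // total


# First twelve bases of the rows starting at NOV2a 2738 and SECR 601 in
# Table 6, which differ at the third position (t versus c).
query = "ggttccctcttg"
subject = "ggctccctcttg"
line = match_line(query, subject)
print(query)
print(line)
print(subject)
print(identity(line.count("|"), len(line)), "% over this fragment")
print(identity(5104, 5107), "% over the full alignment")
```

For the full alignment the untruncated value is about 99.94%, so the three mismatched bases do not lower the reported whole-number percentage below 99%.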

Furthermore, the encoded amino acid sequence has 1023 of 1023 amino acid residues
(100%) identical to, and 1021 of 1023 residues (100%) positive with, a 1023 amino acid
residue human KIAA1233 protein (GenBank Accession No: BAA86547), as shown in Table
7. As indicated by the "Expect" value, the probability of this alignment occurring by chance
alone is 0.0, the lowest probability.

TABLE 6 

Score = 1.010e+04 bits (5095), Expect = 0.0
Identities = 5104/5107 (99%)
Strand = Plus / Plus

NOV2a: 2138 tagcagtgtgcttacatatccagacccagcagacagtcaatgacagcttgtgtgatatgg 2197
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1    tagcagtgtgcttacatatccagacccagcagacagtcaatgacagcttgtgtgatatgg 60

NOV2a: 2198 tccaccgtcctccagccatgagccaggcctgtaacacagagccctgtccccccaggtggc 2257
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  61   tccaccgtcctccagccatgagccaggcctgtaacacagagccctgtccccccaggtggc 120
NOV2a: 2258 atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 2317
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  121  atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 180

NOV2a: 2318 tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 2377
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  181  tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 240

NOV2a: 2378 cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 2437
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  241  cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 300

NOV2a: 2438 ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 2497
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  301  ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 360

NOV2a: 2498 agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 2557
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  361  agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 420

NOV2a: 2558 catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 2617
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  421  catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 480

NOV2a: 2618 ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 2677
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  481  ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 540

NOV2a: 2678 ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 2737
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  541  ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 600

NOV2a: 2738 ggttccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 2797
            || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  601  ggctccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 660
NOV2a: 2798 agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 2857
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  661  agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 720

NOV2a: 2858 caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 2917
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  721  caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 780

NOV2a: 2918 catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga 2977
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  781  catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga 840

NOV2a: 2978 aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 3037
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  841  aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 900

NOV2a: 3038 aaatccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 3097
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  901  aaatccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 960

NOV2a: 3098 aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 3157
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  961  aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 1020

NOV2a: 3158 tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 3217
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1021 tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 1080

NOV2a: 3218 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 3277
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1081 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 1140

NOV2a: 3278 accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 3337
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1141 accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 1200
NOV2a: 3338 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 3397
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1201 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 1260

NOV2a: 3398 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 3457
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1261 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 1320

NOV2a: 3458 aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 3517
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1321 aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 1380

NOV2a: 3518 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 3577
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1381 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 1440

NOV2a: 3578 ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 3637
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1441 ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 1500

NOV2a: 3638 gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 3697
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1501 gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 1560

NOV2a: 3698 tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 3757
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1561 tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 1620

NOV2a: 3758 cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 3817
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1621 cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 1680

NOV2a: 3818 ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 3877
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1681 ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 1740
NOV2a: 3878 tacagatacagaatcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatc 3937
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1741 tacagatacagaatcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatc 1800

NOV2a: 3938 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 3997
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1801 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 1860

NOV2a: 3998 ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 4057
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1861 ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 1920

NOV2a: 4058 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 4117
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1921 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 1980

NOV2a: 4118 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 4177
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  1981 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 2040

NOV2a: 4178 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 4237
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2041 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 2100

NOV2a: 4238 tagccaccaatgctcttggaaaggcagtggcaacatctgtattccacttgctggaacgaa 4297
            ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
SECR:  2101 tagccaccaatgctcttggaaaggcagtggcaacatctgtactccacttgctggaacgaa 2160

NOV2a: 4298 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 4357
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2161 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 2220

NOV2a: 4358 ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 4417
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2221 ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 2280
NOV2a: 4418 gggagcctggtaactggtcacattgttctgccacctgtggtcatttgggagcccgcattc 4477
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2281 gggagcctggtaactggtcacattgttctgccacctgtggtcatttgggagcccgcattc 2340

NOV2a: 4478 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 4537
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2341 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 2400

NOV2a: 4538 tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 4597
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2401 tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 2460

NOV2a: 4598 tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 4657
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2461 tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 2520

NOV2a: 4658 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 4717
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2521 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 2580

NOV2a: 4718 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 4777
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2581 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 2640

NOV2a: 4778 aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 4837
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2641 aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 2700

NOV2a: 4838 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 4897
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2701 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 2760

NOV2a: 4898 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 4957
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR:  2761 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 2820
NOV2a: 4958 cctgtacagcagcctgtggcaggggtttccagtctcggaaagtcgactgtatccacacaa 5017
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 2821 cctgtacagcagcctgtggcaggggtttccagtctcggaaagtcgactgtatccacacaa 2880

NOV2a: 5018 ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 5077
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 2881 ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 2940

NOV2a: 5078 ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 5137
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 2941 ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 3000

NOV2a: 5138 ttgtaaaacatcttaatttgtgttctctagaccgctacaaacaaaggtgctgccagtcat 5197
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3001 ttgtaaaacatcttaatttgtgttctctagaccgctacaaacaaaggtgctgccagtcat 3060

NOV2a: 5198 gtcaagagggataaacctttggaggggtcatgatgctgctgtgaagataaaagtagaata 5257
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3061 gtcaagagggataaacctttggaggggtcatgatgctgctgtgaagataaaagtagaata 3120

NOV2a: 5258 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 5317
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3121 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 3180

NOV2a: 5318 ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 5377
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3181 ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 3240

NOV2a: 5378 ccaatggtagttttatattccaattttttaaaatgatgtattcaaggatgaacaaaatac 5437
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3241 ccaatggtagttttatattccaattttttaaaatgatgtattcaaggatgaacaaaatac 3300

NOV2a: 5438 tatagcatgcatgccactgcacttgggacctcatcatgtcagttgaatcgagaaatcacc 5497
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3301 tatagcatgcatgccactgcacttgggacctcatcatgtcagttgaatcgagaaatcacc 3360



WO 01/62928 PCT/US01/06151

NOV2a: 5498 aagattatgagtgcatcctcacgtgctgcctctttcctgtgatatgtagactagcacaga 5557
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3361 aagattatgagtgcatcctcacgtgctgcctctttcctgtgatatgtagactagcacaga 3420

NOV2a: 5558 gtggtacatcctaaaaacttgggaaacacagcaacccatgacttcctcttctctcaagtt 5617
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3421 gtggtacatcctaaaaacttgggaaacacagcaacccatgacttcctcttctctcaagtt 3480

NOV2a: 5618 gcaggttttcaacagttttataaggtatttgcattttagaagctctggccagtagttgtt 5677
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3481 gcaggttttcaacagttttataaggtatttgcattttagaagctctggccagtagttgtt 3540

NOV2a: 5678 aagatgttggcattaatggcattttcatagatccttggtttagtctgtgaaaaagaaacc 5737
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3541 aagatgttggcattaatggcattttcatagatccttggtttagtctgtgaaaaagaaacc 3600

NOV2a: 5738 atctctctggataggctgtcacactgactgacctaagggttcatggaagcatggcatctt 5797
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3601 atctctctggataggctgtcacactgactgacctaagggttcatggaagcatggcatctt 3660

NOV2a: 5798 gtccttgcttttagaacacccatggaagaaaacacagagtagatattgctgtcatttata 5857
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3661 gtccttgcttttagaacacccatggaagaaaacacagagtagatattgctgtcatttata 3720

NOV2a: 5858 caactacagaaatttatctatgacctaatgaggcatctcggaagtcaaagaagagggaaa 5917
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3721 caactacagaaatttatctatgacctaatgaggcatctcggaagtcaaagaagagggaaa 3780

NOV2a: 5918 gttaaccttttctactgatttcgtagtatattcagagctttcttttaagagctgtgaatg 5977
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3781 gttaaccttttctactgatttcgtagtatattcagagctttcttttaagagctgtgaatg 3840

NOV2a: 5978 aaactttttctaagcactattctattgcacacaaacagaaaaccaaagccttattagacc 6037
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3841 aaactttttctaagcactattctattgcacacaaacagaaaaccaaagccttattagacc 3900
NOV2a: 6038 taatttatgcataaagtagtattcctgagaactttattttggaaaatttataagaaagta 6097
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3901 taatttatgcataaagtagtattcctgagaactttattttggaaaatttataagaaagta 3960

NOV2a: 6098 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 6157
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3961 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 4020

NOV2a: 6158 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 6217
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4021 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 4080

NOV2a: 6218 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 6277
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4081 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 4140

NOV2a: 6278 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 6337
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4141 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 4200

NOV2a: 6338 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 6397
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4201 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 4260

NOV2a: 6398 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 6457
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4261 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 4320

NOV2a: 6458 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 6517
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4321 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 4380

NOV2a: 6518 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 6577
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4381 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 4440
NOV2a: 6578 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 6637
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4441 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 4500

NOV2a: 6638 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 6697
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4501 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 4560

NOV2a: 6698 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 6757
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4561 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 4620

NOV2a: 6758 gtctcttttgttctaacttcagcttcactgacactgggttgagcactactgtatgtggag 6817
            |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
SECR : 4621 gtctcttttgttctaatttcagcttcactgacactgggttgagcactactgtatgtggag 4680

NOV2a: 6818 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 6877
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4681 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 4740

NOV2a: 6878 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 6937
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4741 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 4800

NOV2a: 6938 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 6997
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4801 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 4860

NOV2a: 6998 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 7057
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4861 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 4920

NOV2a: 7058 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 7117
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4921 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 4980
NOV2a: 7118 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 7177
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4981 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 5040

NOV2a: 7178 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 7237
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 5041 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 5100

NOV2a: 7238 tccttat 7244 (SEQ ID NO: 60)
            |||||||
SECR : 5101 tccttat 5107 (SEQ ID NO: 26)



TABLE 7

Score = 2045 bits (5300), Expect = 0.0

Identities = 1021/1023 (99%), Positives = 1021/1023 (99%)

NOV2A:  669 AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 728
            AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV
SECR :    1 AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 60

NOV2A:  729 YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 788
            YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ
SECR :   61 YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 120

NOV2A:  789 LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 848
            LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR
SECR :  121 LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 180

NOV2A:  849 LAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 908
            LAAKGRRIPLSEMMCRDLPG PLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT
SECR :  181 LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 240

NOV2A:  909 REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 968
            REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK
SECR :  241 REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 300

NOV2A:  969 IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 1028
            IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV
SECR :  301 IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 360

NOV2A: 1029 TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 1088
            TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA
SECR :  361 TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 420

NOV2A: 1089 YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 1148
            YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA
SECR :  421 YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 480

NOV2A: 1149 AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 1208
            AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT
SECR :  481 AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 540

NOV2A: 1209 EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 1268
            EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH
SECR :  541 EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 600

NOV2A: 1269 LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 1328
            LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ
SECR :  601 LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 660

NOV2A: 1329 PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVFHLLERR 1388
            PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSV HLLERR
SECR :  661 PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR 720

NOV2A: 1389 WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 1448
            WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ
SECR :  721 WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 780

NOV2A: 1449 RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 1508
            RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV
SECR :  781 RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 840

NOV2A: 1509 TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 1568
            TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH
SECR :  841 TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 900

NOV2A: 1569 TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 1628
            TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR
SECR :  901 TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 960

NOV2A: 1629 SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1688
            SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC
SECR :  961 SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1020

NOV2A: 1689 QEG 1691 (SEQ ID NO: 61)
            QEG
SECR : 1021 QEG 1023 (SEQ ID NO: 27)
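
The identity figure reported for this alignment can be recomputed from any pair of aligned rows. The following is a minimal Python sketch and is not part of the original disclosure; the function name `percent_identity` is hypothetical, and the two strings are the aligned rows covering NOV2A positions 849-908 and SECR positions 181-240, which differ at a single residue (F versus L).

```python
def percent_identity(query: str, subject: str) -> tuple[int, int, float]:
    """Count identical positions between two aligned rows of equal length."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    # Ignore columns in which both rows carry a gap character.
    columns = [(q, s) for q, s in zip(query, subject) if not (q == "-" and s == "-")]
    matches = sum(1 for q, s in columns if q == s and q != "-")
    return matches, len(columns), 100.0 * matches / len(columns)


nov2a_row = "LAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT"
secr_row = "LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT"

matches, total, pct = percent_identity(nov2a_row, secr_row)
print(matches, total, round(pct, 1))  # 59 60 98.3

# Overall figure stated for the full alignment: 1021 identical residues of 1023.
print(round(100.0 * 1021 / 1023, 1))  # 99.8, shown in the report as 99%
```

Applied row by row, the same count locates the two non-identical positions that account for the 1021/1023 figure.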



SignalP and PSORT analyses indicate that NOV-2 may be localized in the endoplasmic
reticulum, with a likely cleavage site between positions 26 and 27. Thus, it is likely that
the NOV-2a protein is available at the appropriate sub-cellular localization for the
therapeutic uses described in this application.

Based on the relatedness of the disclosed NOV-2a to KIAA1233 sequences, which are
related to lacunin, thrombospondins, proteinases, semaphorins, ADAM-TS and properdin
family members, the nucleic acids and proteins of the invention can have functions similar
to those of proteins belonging to these families.

Functional roles attributed to this family of proteins include cell attachment, spreading,
motility, and proliferation, cytoskeletal organization, wound healing, and angiogenesis.
Moreover, these proteins are expressed in the nervous system during development and are
thought to play roles in neuronal growth and patterning. In particular, the thrombospondin,
METH-1 and ADAMTS families of proteins are potent inhibitors of angiogenesis. The
ADAMTS proteins have also been implicated in cleavage of proteoglycans and the control of
organ shape during development. In addition, the thrombospondins have been implicated in
the activation of both transforming growth factor-beta (TGF-β) precursors and TGF-β in a
variety of disease states. Furthermore, semaphorin proteins have shown expression in
undifferentiated neuroepithelium, suggesting that these proteins are actors in axonal guidance.
Thus, the NOV-2a sequences of the invention are implicated in the following diseases and
processes and have therapeutic uses in these diseases and processes: (i) inflammation, (ii)
cancer, (iii) neuronal development and axonal guidance, (iv) angiogenesis and vasculogenesis
in cancer as well as for ischemia, (v) tissue regeneration in vivo and in vitro, and (vi)
other diseases and disorders.



A NOV-2b nucleic acid of the invention, encoding a KIAA1233-like protein, is found
within the nucleotide sequence of NOV-2a (SEQ ID NO: 3) in Table 5. The disclosed nucleic
acid is 6303 nucleotides in length and contains an open reading frame (ORF) that begins with
an ATG initiation codon at nucleotide 425 and ends with a TAA stop codon at nucleotides
4268 (SEQ ID NO: 57). The initiation and stop codons of NOV-2b are shown in bold font in
SEQ ID NO: 4. The representative ORF encodes a 406 amino acid polypeptide (SEQ ID NO:
5), which is shown below in Table 8. Putative untranslated regions are upstream of the
initiation codon and downstream of the stop codon in SEQ ID NO: 57.
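
The relationship between an initiation codon, an in-frame stop codon and the number of encoded residues can be illustrated with a short Python sketch. This is an illustrative helper only and is not part of the original disclosure; the function name `find_orf` and the 16-nucleotide toy sequence are hypothetical, and positions are 1-based as in the text above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(seq: str, start: int):
    """Scan in frame from a 1-based ATG position to the first stop codon.

    Returns (start, first base of the stop codon, number of sense codons),
    or None when no in-frame stop codon is present.
    """
    seq = seq.upper()
    i = start - 1  # convert to a 0-based index
    if seq[i:i + 3] != "ATG":
        raise ValueError("no ATG initiation codon at the given position")
    for j in range(i, len(seq) - 2, 3):
        if seq[j:j + 3] in STOP_CODONS:
            return start, j + 1, (j - i) // 3
    return None


# Toy sequence (hypothetical): 2-nt leader, ATG GCT TGT, TAA stop, 2-nt trailer.
toy = "CCATGGCTTGTTAAGG"
print(find_orf(toy, 3))  # (3, 12, 3)
```

In the toy sequence the untranslated regions are the two nucleotides upstream of the ATG and the two nucleotides downstream of the TAA, mirroring the layout described for SEQ ID NO: 57.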



NOV 2b: 



TABLE 8 




TATAATTATTAATAGAGACCTTTCAAAGGACAAATTCTGTGAAATAAAGTGGTTTTCTGA 
AGAGCCTACTAATAGGACA6TGTGTTAATATCACTAATAAGAGAGTAATGATTATAAAAA 
GGAATAAATTTATTGAAATTGCAAGATACTTTTCTCCTTTGATTAATATACTGCTAGTTT 
AGTTTTCTACATTTTCAAATAGAACTGGGGAATTTGTGTCGTAGATATTCTTGACAACTA 
AAGAGATGGTGGCTGAATTTTTGGGAATGGTTGATAACACTTGATATTTTTAGTTTCCAA 
TTTGGAAGAGCTCTGTCTCTTGGGATGTCAAATATTATATTCGTCAATTAATGAATGTGT 
TAATTTATTATAGT^TGATATTCTCACAATGATTTCATTTGTAGTGATGGATTTAAAGA 
GATAATGCCCTATGACCACTTCCauVCCTCTTCCTCGCTGGGAACATAATCCTTGGACTGC 
ATGTTCCGTGTCCTGTGGAG6AGGGATTCA6AGACGGAGCTTTGTGTGTGTAGAGGAATC 
CATGCATGGAGAGATATTGCAGGTGGAAGAATGGAAGT6CATGTAC6CACCCAAACCCAA 
GGTTATGCAAACTTGTAATCTGTTTGATTGCCCCAAGTGGATTGCCATGGAGTG6TCTCA 
GTGCACAGTGACTTGTGGCCGAGGGTTACGGTACCGGGTTGTTCTGTGTATTAACCACCG 
CGGAGAGCATGTTGGGGGCTGCAATCCACAACTGAAGTTACACATCAAAGAAGAATGTGT 
CATTCCCATCCCGTGTTATAAACCAAAAGAAAAAA6TCCAGTGGAAGCAAAATTGCCTTG 
GCTGAAACAAGCACy^GAACTAGAAGAGACCAGAATAGCaACAGAAGAACCAA^^ 
TCCAGa^CCCTGGTCMCCTGCAGTACCACGTGTGGGCCGGGTGTGCAGGTCCGTGAGGT
GAAGTGCCGTGTGCTCCrmCATTCACGCAGACPGAGACrGAGCTGCCCGAGGJi^^ 
TGAAGGCCCCAAGCTGCCCACCGaACGGCCCTGCCTCCTGGAAGCATGTGATGAGAGCCC 
GGCCTCCCGAGAGCTiWSACATCCCTCTCCCTGAGGACAGTGAGACGACTTAC^ 
5 GTACGCTGGGTTCACCCCTTGCACAGCyVACATGCGTGGGAGGCCATCAAG 

AGTGTGCTTACATATCCAGACCCAGCAGACAGTCAATGACAGCTTGTGTGATATGGTCCA 
CCGTCCTCCAGCCATGAGCCAGGCCTGTAACACAGAGCCCTGTCCCCCCAGGTGGCATGT 
GGGCTOTTGGGGGCCCTGCTCAGCTACCTGTGGAGTTGGAATTCaiGACCC^ 
CTGCCTGCACCCAGGGGAGACCCCTGCCCCTCCTGAGGAGTGCCGAGATGAAAAGCCC^ 

10 TGCTTTACa^CATGCAATCAGTTTGACTGCCCTCCrPGGC^GGCaC^^ 

GCAGTGTTCCAGGACTTGTGGCGGGGGAACTCA6AACAGAAGAGTCACCTGTCGGCAGCT 
GCTAACGGATGGCAGCTTTTTGAATCTCTCAGATGAATTGTGCCAAGGACCCAAGGCATC 
GTCTCACaAGTCCTGTGCCAGGACaVGACTGTCCTCCAGATTTAGCTGTGGGAGACT(^^ 
GAAGTGTTCTCTCAGOTGTGGTGTTGGAATCCAGAGaAGAAAGCAGGTGTGTCAAAGG 

1 5 GGCaGCGAAAGGTOSGCGCATCCCCCTCAGTGAGATGATGTGCAGGGATCrrACCAGGGTT 
CCCTCTTGTAAGATCTTGCCAGATGCCTGAGTGCAGTAAAATCAAATCAGAGATGAAGAC 
AAAACTTGGTGAGCAGGGTCCGCAGATCCTCAGTGTCCAGAfiAGTCTACATTCAGACAA^ 
GGAAGAGAAGCGTATTAACCTGACCATTGGTAGCAGAGCCTATTTGCTGCCCAACACATC 
CGTGATTATTAAGTGCCCCGTGCGACGATTCCAGAAATCTCTGATCCAGTGGGAGAAGGA 

20 TGGCCGTTGCCPGCAGAACTCCAAACGGCTTGGCATCy^CXay^GTCa 

CCACGGTCTTGCTGCCCCCGACATCGGCGTGTACCGGTGCATTGCAGGCTCTGCACAGGA 
AACAGTTGTGCTCAAGCTCATTGGTACTGACAACCGGCTCATCGCACGCCCAGCCCTm 
GGAGCCTATGAGGGAATATCCTGGGATGGACCACAGCGAAGCCAATAGTTTGGGAGTC^^ 
ATGGCACAAAATGAGGCAAATGTGGAATAACAAAAATGACCIT?TATOTGGATGATG^^ 

25 CATTAGTAACCAGCCTTTCTTGAGAGCTCTGTTAGGCCACTGCAGCAATTCTGCAGGAAG 
CACCAACTCCTGGGAGTTGAAGAATAAGa«3TTTGAAGCAGCAGTTAAACAAGGA 
TAGCATGGATACAGCCCaGTTTGATGAGCTGATAAGAAACATGAGTCAGCTCATGGAAAC 
CGGAGAGGTCAGCGATGATCTTGCGTCCCAGCTGATATATCAGCTGGTGGCCGAATTAGC 
CAAGGCACAGCCAACaCACaTGCAGTGGCGGGGCATCraGGAAGAGAGACCT^ 

30 TCAGePCAGAGGGGAAACAGGGAGTGTGTCCCTVAAGCTCGCATGCAAAAAACTCAGGC^ 
GCTGACATTCAAGCCGAAAGGACCTGTTCrCATGAGGCaAAGCCAACCTCCCTCAAOT^ 
ATTTAATAAAAO^TAAATTCCAGGATTGGAAATACaGTATACy^TTACAAAAAGGAC^^ 
GGTCATCAATATACTGTGTGACCTTATTACCCCCAGTGAGGCCACATATACATGGACC^ 
GGATGGAACCrTGTTACAGCCCTCAGTAAAAATAATTTTGGATGGAACTGGGAAGATACA 

35 GATACAGTUVTCCTACAAGGAAAGAACaiAGGCATATATGAATGTTCTGTAGCTAATC^ 

TGGTTCAGATGTGGAAAGTTCTTCTGTGCTGTATGCAGAGGCa^CCTGTCATCTTGTCTGT 
TGAAAGT^TATCyvCCAAACCAGAGCACaACCATCTGTCTGTTGTGGTTGGAGGCATCGT 
GGAGGCAGCCCTTGGAGO^AACGTGACAATCCGATGTCCTGTAAAAGGTGTCCCTCAGCC 
TAATATAACTTGGTTGAAGAGAGGAGGATCTCTGAGTGGCAATGTTTCCTTGCTTTTCAA 

40 TGGATCCCTGTTGTTGCAGAATGTTTCCCrrGAAAATGAAGGAACCTACGTCTGCATAGC 
CTVCCAATGCTCTTGGAAAGGCAGTGGCaACATCTGTACTCCACTTGCTGGAACGAAGAT 
GCCAGAGAGTAGAATCGTATTTCTGCAAGGACATAAAAAGTACATTCTCCAGGCAACCAA 
CACTAGAACCAACAGCAATGACCCAACAGGAGAACCCCCGCCTCAAGAGCCT^ 
GCCTGGTAACTGGTGACATTGTTCTGCa^CCTGTGGTCATTTGGGAGCCCGCyVTTC^ 

45 ACCCCAGTGTGTGATGGCCAATGGGCaWSGAAGTGAGTGAGGCCOTGTGTGATavCCTCC^ 
GAAGCCACTGGCTGGGTTTGAGCCCTGTAACATCCGGGACTGCCCAGCGAGGTGGTTCAC 
AAGTGTGTGGTCACAGTGCTCTGTGTCTTGCGGTGAAGGATACCACAGTCGGCAGGTGAC 
GTGCAAGCGGACAAAAGCCAATGGAACTGTGCAGGTGGTGTCTCCAAGAGCATGTGCCCC 
TAAAGACCGGCCTCTGGGAAGAAAACCATGTTTTGGTCATCCATGTGTTCAGTGGGAACC 

50 AGGGAACCGGTGTCCTGGACGTTGCATGGGCCGTGCTGTGAG6ATGCAGCAGCGTCACAC 
AGCTTGTCaACACAACAGCTCTGACTCCAACrrGTGATGACAGAAAGAGA 
AAGGAACTGCACATCAGGGGCCTGTGATGTGTGTTGGCACACAGGCCCTTG6AAGCCCTG 
TACaGCAGCCTGTGGCyVGGGGTTTCCAGTCTCGGAAAGTCGACTGTATCCACaCAAGG^ 
TTGCAAACCTGTGGCCa^GAGACACTGTGTACAGAAAAAGAAACCAATTTCCrrGG^ 

55 CTGTCrTTGGGCCCTCCTGTGATAGAGACTGCACAGACACAACTCa^CTACTGTATG 

AAAACATCTTAATTTGTGTTCTCTAGACCGCTACAAACAAAGGTGCTGCCAGTCATGTCA 
AGAGGGATAAACCTTTGGAGGGGTCATGATGCTGCTGTGAAGATAAAAGTAGAATATAAA 
AGCTCTTTTCCCCATGTCGCTGATTCAAAAACATGTATTTCTTAAAAGACT 
GGATCAAACAGAGGTTGATGCAAAAACACCACTGTTAAGGTGTAAAGTGAAATTTTCCaA 

60 TGGTAGTTTTATATTCCAATTTTTTAAAATGATGTATTCAAG6ATGAACAAAATACTATA 
GCATGCaTGCCACTGCACTTGGGACCTCATCAT6TCAGTTGAATCGAGAAATCACC3A 
TTATGAGTGCATCCTCACGTGCTGCCTCTTTCCTGTGATATGTAGACTAGCACAGAGTG6 
TACATCCTAAAAACTTGGGAAACa^CAGCAACCCATGACTTCCTCTTCTCTCAAGTO 
GTTTTOiACAGTTTTATAAGGTATTTGCATTTTAGAAGCTCTGGCCAGTAGTTGTTAAGA 

65 TGTTGGCATTAATGGCATTTTCATAGATCCTTGGTTTAGTCTGTGAAAAAGAAACCaTCT 
CTCTGGATAGGCTGTCACACTGACTGACCTAAGGGTTCATGGAAGCATGGCATCTTGTCC 
TTGCTTTTAGAACyVCCCATGGAAGAAAACACaiGAGTAGATATTGCTGTCATTTATACy^ 
TACAGAAATTTATCTATGACCTAATGAGGCATCTCGGAAGTCAAAGAAGAGGGAAAGTTA 
ACCTTTTCTACTGATTTCGTAGTATATTCAGAGCrrrCTTTTAAGAGCrrGTGAAT^^ 

70 TTTTTCTAAGCACTATTCTATTGCACaVCAAACAGAAAACCAAAGCCTTATTAGACCTAAT 
TTATGCATiUVAGTAGTATTCCPGAGAACTTTATTTTGGAAAATTTATAAGAAAGTAATC^
JUUVTAAGAAAOVCGATAGTTGAAAATAATTTTTATAGTAAATAATTGTTTO 
TTTCAGTAAATCCAAAGTGACTTAGGTTAGAAGTTACACTAAGGACCAGGGGTTGGAATC 
AGAATTTAGTTTAAGATTTGAGGAAAAGGGTAAGGGTTAGTTTCAGTTTTAGGATTAGAG " 
5 CTAGAATTGGGTTAGGTGAGAAAGAAAGTTAAGGTTAAGGCTAGAGTTGTCTTTAAGGGT 
TAGGGTTAGGACCAGGTTAGGTC^USGGTTGGATTGGGTTTAfiATTGGGGCCAGTGCT 
GTTAGTGATAGTGTCAGGATGGAGGTTAGGTTTGGAGTAAGCGTTGTTGCTGAAGTGAGT 
TCAGGCTAGCATTAAATTGTAAGTTCTGAAGCTGATTTGGTTATGGGGTCTTTCCCCTGT 
ATACTACCAGTTGTGTCPTTAGATGGCACACaAGTCCAAATAAGTGGTCATACTTCTTTA 

10 TTCAGGGTCTCAGCTGCCTGTACTVCCTGCTGCCTACATCTTCTTGGCAAC^^ 

GCCACAGGCTCTGCTGAGCCTAGTTCCTGGTCAGTAATAACTGAACAfiTGCATTTTGGCT 
TTGGATGTGTCTGTGGACAAGCPTGCTGA6TTTCTCTACCATATTCTGAGCA 
CTTTTGTTCTAATTTCAGCTTCACTGACACTGGGTTGAGCACTACTGTATGTGGAG^ 
TGGTGATTGGGAATGGATGGGGGACAGTGAGGAGGACACACCAGCCCATTAGTTGTTAAT 

15 CMCaVATCACATCTGATTGTTGAAGGTTATTAAATTAAAAGAAAGATCATOT 

CTCTTTGTATATATTTATTATATGAAAGGTGCAA'rATTTTATTTTGTACAGTATGTAATA 
AAGACyVTGGGACaiTATATTTTTCTTATTAACAAAATTTCATATTAAATTGCTTttCTTTG 
TATTTAAAGTTAAAAGTTACTATTTTTCATTTGCTATTGTACTTTCATTGTTGTCATTCA 
ATTGACATTCCTGTGTACTGTATTTTACTACTGTTTTTATAACATGAGAGTTAATGTTTC 

20 TGTTTCATGATCCTTATGTAATTmGAAATAAATTTACTTTGATTATTCAGTGGCATCCT 

TAT (SEQ ID NO: 57) 



MPYDHFQPLPRWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKV
MQTCNLFDCPKWIAMEWSQCTVTCGRGLRYRVVLCINHRGEHVGGCNPQLKLHIKEECVI
PIPCYKPKEKSPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVK
CRVLLTFTQTETELPEEECEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEY
AGFTPCTATCVGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVG
SWGPCSATCGVGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQ
CSRTCGGGTQNRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSK
CSVSCGVGIQRRKQVCQRLAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTK
LGEQGPQILSVQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDG
RCLQNSKRLGITKSGSLKIHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALRE
PMREYPGMDHSEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGST
NSWELKNKQFEAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAK
AQPTHMQWRGIQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISF
NKTINSRIGNTVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQI
QNPTRKEQGIYECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVE
AALGANVTIRCPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIAT
NALGKAVATSVLHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEP
GNWSHCSATCGHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTS
VWSQCSVSCGEGYHSRQVTCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPG
NRCPGRCMGRAVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCT
AACGRGFQSRKVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVK
HLNLCSLDRYKQRCCQSCQEG (SEQ ID NO: 5)



Table 9 shows a multiple sequence alignment of NOV-1, NOV-2a, and NOV-2b
polypeptides with a KIAA1233 protein (GenBank Accession No: BAA86547), which
demonstrates the homology between the disclosed sequences according to the invention and a
known member of the protein family.
KIAA1233
NOV1
NOV2b
NOV2a    MASWTSPWWVLIGMVmHSPLPQTTAEKSPGAYFLPEFALSPQGSFLEDTTGEQFLTYRY

KIAA1233
NOV1
NOV2b
NOV2a    DDQTSRNTRSDEDKDGNWDAWGDWSDCSRTCGGGASYSLRRCLTGRNCEGQNIRYKTCSN

KIAA1233
NOV1
NOV2b
NOV2a    HDCPPDAEDFRAQQCSAYNDVQYQGHYYEWLPRYNDPAAPCALKCHAQGQNLVVELAPKV

KIAA1233
NOV1
NOV2b
NOV2a    LDGTRCNTDSLDMCISGICQAVGCDRQLGSNAKEDNCGVCAGDGSTCRLVRGQSKSHVSP

KIAA1233
NOV1
NOV2b
NOV2a    EKREENVIAVPLGSRSVRITVKGPAHLFIESKTLQGSKGEHSFNSPGVFWENTTVEFQR

KIAA1233
NOV1
NOV2b
NOV2a    GSERQTFKIPGPLMADFIFKTRYTAAKDSWQFFFYQPISHQWRQTDFFPCTVTCGGGYQ

KIAA1233
NOV1                                                      MPYDHFQPLP
NOV2b                                                     MPYDHFQPLP
NOV2a    LNSAECVDIRLKRWPDHYCHYYPENVKPKPKLKECSMDPCPSSDGFKEIMPYDHFQPLP

KIAA1233
NOV1     RWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCP
NOV2b    RWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCP
NOV2a    RWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCP

KIAA1233
NOV1     KWIAMEWSQCTVTCGRGLRYRVVLCINHRGEHVGGCNPQLKLHIKEECVIPIPCYKPKEK
NOV2b    KWIAMEWSQCTVTCGRGLRYRVVLCINHRGEHVGGCNPQLKLHIKEECVIPIPCYKPKEK
NOV2a    KWIAMEWSQCTVTCGRGLRYRVVLCINHRGEHVGGCNPQLKLHIKEECVIPIPCYKPKEK

KIAA1233
NOV1     SPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQT
NOV2b    SPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQT
NOV2a    SPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQT

KIAA1233
NOV1     ETELPEEECEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATC
NOV2b    ETELPEEECEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATC
NOV2a    ETELPEEECEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATC

KIAA1233         AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG
NOV1     VGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG
NOV2b    VGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG
NOV2a    VGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG
                 ****************************************************

KIAA1233 VGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ
NOV1     VGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ
NOV2b    VGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ
NOV2a    VGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ
         ************************************************************

KIAA1233 NRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ
NOV1     NRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ
NOV2b    NRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ
NOV2a    NRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ
         ************************************************************

KIAA1233 RRKQVCQRLAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILS
NOV1     RRKQVCQRLAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILS
NOV2b    RRKQVCQRLAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILS
NOV2a    RRKQVCQRLAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILS
         ****************************:*******************************

KIAA1233 VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG
NOV1     VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG
NOV2b    VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG
NOV2a    VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG
         ************************************************************

KIAA1233 ITKSGSLKIHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDH
NOV1     ITKSGSLKIHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDH
NOV2b    ITKSGSLKIHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDH
NOV2a    ITKSGSLKIHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDH
         ************************************************************

KIAA1233 SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF
NOV1     SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF
NOV2b    SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF
NOV2a    SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF
         ************************************************************

KIAA1233 EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG
NOV1     EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG
NOV2b    EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG
NOV2a    EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG
         ************************************************************

KIAA1233 IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN
NOV1     IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN
NOV2b    IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN
NOV2a    IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN
         ************************************************************

KIAA1233 TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI
NOV1     TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI
NOV2b    TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI
NOV2a    TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI
         ************************************************************

KIAA1233 YECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIR
NOV1     YECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIR
NOV2b    YECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIR
NOV2a    YECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIR
         ************************************************************

KIAA1233 CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATS
NOV1     CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATS
NOV2b    CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATS
NOV2a    CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATS
         ************************************************************

KIAA1233 VLHLLEEIRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATC
NOV1 VLHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATC
NOV2b VLHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPEWEPGNWSHCSATC
NOV2a VFHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATC

KIAA1233 GHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCG
NOV1 GHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCG
NOV2b GHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCG
NOV2a GHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCG
************************************************************ 

KIAA1233 EGYHSRQVTCKRTKANGTVQWSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGR
NOV1 EGYHSRQVTCKRTKANGTVQWSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGR
NOV2b EGYHSRQVTCKRTKANGTVQWSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGR
NOV2a EGYHSRQVTCKRTKANGTVQWSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGR
************************************************************ 

KIAA1233 AVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSR
NOV1 AVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSR
NOV2b AVRMQQRHTACQHKSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSR
NOV2a AVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSR
************************************************************ 

KIAA1233 KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRY
NOV1 KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRY
NOV2b KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRY
NOV2a KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRY
************************************************************ 

KIAA1233 KQRCCQSCQEG (SEQ ID NO: 28)
NOV1 KQRCCQSCQEG (SEQ ID NO: 2)
NOV2b KQRCCQSCQEG (SEQ ID NO: 5)
NOV2a KQRCCQSCQEG (SEQ ID NO: 4)
***********
Consensus key 

* - single, fully conserved residue

: - conservation of strong groups

. - conservation of weak groups

(blank) - no consensus
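
As an illustration of how the consensus key above maps onto alignment columns, the following is a minimal Python sketch. The strong and weak residue groups are the ones conventionally used by ClustalW-style programs and are stated here as an assumption; they are not listed in this document. The function names are hypothetical.

```python
# Residue groups as conventionally used by ClustalW-style consensus lines.
# These lists are an assumption for illustration, not taken from this document.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column.upper())
    if "-" in residues:
        return " "  # a gap in any sequence means no consensus
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one weak group
    return " "


def consensus_line(sequences):
    """Build the consensus line for equal-length aligned sequences."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return "".join(consensus_symbol("".join(col)) for col in zip(*sequences))


# The final alignment block above: four identical rows give a line of asterisks.
last_block = ["KQRCCQSCQEG"] * 4
print(consensus_line(last_block))
```

Columns in which every sequence carries the same residue yield an asterisk, which is why fully conserved blocks of the alignment are underlined with an unbroken row of asterisks.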

Based on the relatedness of the disclosed NOV-2b to the disclosed NOV-1, the disclosed
NOV-2a, and KIAA1233 sequences, which as noted are related to lacunin, thrombospondins,
proteinases, semaphorins, ADAM-TS and properdin family members, the nucleic acids and
proteins of the invention can have functions similar to those of proteins belonging to these
families. Thus, the invention is implicated in the following diseases and processes and has
therapeutic uses in these diseases and processes: (i) inflammation, (ii) cancer, (iii) neuronal
development and axonal guidance, (iv) angiogenesis and vasculogenesis, in cancer as well as
for ischemia, (v) tissue regeneration in vivo and in vitro, and (vi) other diseases and disorders.

Functional roles attributed to this family of proteins include cell attachment, spreading,
motility, and proliferation, cytoskeletal organization, wound healing, and angiogenesis.
Moreover, these proteins are expressed in the nervous system during development and are
thought to play roles in neuronal growth and patterning. In particular, the thrombospondin,
METH-1 and ADAMTS families of proteins are potent inhibitors of angiogenesis. The
ADAMTS proteins have also been implicated in cleavage of proteoglycans and the control of
organ shape during development. In addition, the thrombospondins have been implicated in
the activation of both transforming growth factor-beta (TGF-β) precursors and TGF-β in a
variety of disease states. Furthermore, semaphorin proteins have shown expression in
undifferentiated neuroepithelium, suggesting that these proteins are actors in axonal guidance.

The novel nucleic acids of the invention encoding human proteins include the nucleic
acids whose sequences are provided as NOV-1, NOV-2a, and NOV-2b, respectively, or
fragments thereof. The invention also includes mutant or variant nucleic acids any of whose
bases may be changed from the corresponding bases shown as NOV-1, NOV-2a, and NOV-2b,
while still encoding a protein that maintains its human KIAA1233-like protein activities
and physiological functions, or a fragment of such nucleic acids. The invention further
includes nucleic acids whose sequences are complementary to those just described, including
nucleic acid fragments that are complementary to any of the nucleic acids just described. The
invention additionally includes nucleic acids or nucleic acid fragments, or complements
thereto, whose structures include chemical modifications. Such modifications include, by way
of non-limiting example, modified bases, and nucleic acids whose sugar phosphate backbones
are modified or derivatized. These modifications are carried out at least in part to enhance the
chemical stability of the modified nucleic acid, such that they may be used, for example, as
anti-sense binding nucleic acids in therapeutic applications in a subject.

The novel proteins of the invention include the human KIAA1233-like proteins whose
sequences are provided as NOV-1, NOV-2a, and NOV-2b, respectively. The invention also
includes a mutant or variant protein any of whose residues may be changed from the
corresponding residues shown as NOV-1, NOV-2a, and NOV-2b, while still encoding a
protein that maintains its human KIAA1233-like protein activities and physiological
functions, or a functional fragment thereof.

The invention further encompasses antibodies and antibody fragments, such as Fab or
(Fab)2, that bind immunospecifically to any of the proteins of the invention.

The expression pattern and protein similarity information for the invention suggest
that NOV-1, NOV-2a and NOV-2b may function as human KIAA1233-like proteins.
Therefore, the nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in, for example but not limited to, (i) inflammation, (ii) cancer, (iii)
neuronal development and axonal guidance, (iv) angiogenesis and vasculogenesis, in cancer
as well as for ischemia, (v) tissue regeneration in vivo and in vitro, and (vi) other diseases
and disorders. The homology to antigenic secreted and membrane proteins also suggests that
antibodies directed against the novel genes may be useful in treatment and prevention of (i)
inflammation, (ii) cancer, (iii) neuronal development and axonal guidance, (iv) angiogenesis
and vasculogenesis, in cancer as well as for ischemia, (v) tissue regeneration in vivo and
in vitro, and (vi) other diseases and disorders.

Potential therapeutic uses for the invention(s) are, for example but not limited to, the
following: (i) protein therapeutic, (ii) small molecule drug target, (iii) antibody target
(therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) diagnostic and/or prognostic
marker, (v) gene therapy (gene delivery/gene ablation), (vi) research tools, and (vii) tissue
regeneration in vitro and in vivo (regeneration for all these tissues and cell types composing
these tissues and cell types derived from these tissues).

NOV-3: A Novel STE20 Protein Kinase 

The NOV-3 sequences (NOV-3a, NOV-3b, NOV-3c, and NOV-3d) according to the
invention are splice variants related to STE20 protein kinases. The differences between the
four sequences relate to the four ways of independently combining two deletions arising from
two splice variants in the mRNAs.
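
The count of four follows from each of the two deletions being independently present or absent (2 x 2 = 4). A minimal Python sketch of that enumeration is given below; the deletion names `deletion_A` and `deletion_B` are hypothetical placeholders, and no assignment of a particular combination to NOV-3a, NOV-3b, NOV-3c, or NOV-3d is implied.

```python
from itertools import product


def splice_variant_combinations(deletions):
    """Return every present/absent combination of independent deletions."""
    variants = []
    for flags in product([False, True], repeat=len(deletions)):
        # Map each deletion name to whether it is present in this variant.
        variants.append(dict(zip(deletions, flags)))
    return variants


# Hypothetical labels for the two deletions; the document does not name them.
combos = splice_variant_combinations(["deletion_A", "deletion_B"])
for combo in combos:
    print(combo)
```

With two independent deletions the function returns four combinations: neither deletion, either one alone, or both together.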

Splice variants are sequences that occur naturally within the cells and tissues of
individuals. The physiological activity of splice variant products and the original protein, from
which they are varied, may be the same (although perhaps at a different level), opposite, or
completely different and unrelated. In addition, variants may have no activity at all. When a
variant and the original sequence have the same or opposite activity, they may differ in various
properties not directly connected to biological activity, such as stability, clearance rate, tissue
and cellular localization, temporal pattern of expression, up or down regulation mechanisms,
and responses to agonists or antagonists. The presence or level of specific splice variants may
be the cause of, and/or indicative of, a disease, disorder, pathological or normal condition.

Because a drug may be effective against one variant but not another, or may cause side
effects because it targets all splice variants, an effective drug needs to target the particular
splice variant. Because soluble variants with therapeutic or disease-related functions may be
naturally occurring in specific tissues, they may be optimal candidates for drug targets or
protein therapeutics. Variants may have no activity at all and may thus serve as dominant
negative natural inhibitors. Thus, splice variants are useful in generating new drug targets,
protein therapeutics and markers for diagnostics.

NOV-3 sequences according to the invention encode polypeptides related to STE20
protein kinases, whose subgroups include GCK, SLK, and PSK proteins. Therefore, the
nucleic acids and proteins of the invention can have functions similar to those of proteins
belonging to these subgroups.

Functional roles attributed to STE20 proteins include cytoskeletal organization,
apoptosis, and signal transduction pathways. Thus, the NOV-3 nucleic acids and
polypeptides, antibodies and related compounds according to the invention will be useful in
therapeutic and diagnostic applications in, e.g., metabolic and endocrine disorders, cancer,
bone disorders, and tissue/cell growth regulation disorders.

NOV-3 sequences were initially identified by searching CuraGen's Human SeqCalling
database for DNA sequences that translate into proteins with similarity to the STE20 protein
kinase family. The SeqCalling assembly for NOV-3 was analyzed further to identify open
reading frame(s) encoding novel full length protein(s) and novel splice variants of these
genes. This was done by extending the SeqCalling assembly using additional SeqCalling
assemblies, publicly available EST sequences and public genomic sequence. Public ESTs and
additional CuraGen SeqCalling assemblies were identified by the CuraTools program
SeqExtend. They were included in the DNA sequence extension for SeqCalling assembly
18552586 when extended overlaps were found.

SeqCalling is a differential expression and sequencing procedure that normalizes
mRNA species in a sample, and is disclosed in U.S. Ser. No. 09/417,386 filed October 13,
1999, which is incorporated herein by reference in its entirety.

A genomic clone of NOV-3 was analyzed by Genscan™ and Grail™ to identify exons
and putative coding sequences/open reading frames. The NOV-3 clone was also analyzed by
TblastN, BlastX and other homology programs to identify regions translating to proteins with
similarity to the original protein/protein family of interest.

The results of these analyses were integrated and manually corrected for apparent
inconsistencies, thereby obtaining the sequences encoding the full-length protein. When
necessary, the process to identify and analyze cDNAs/ESTs and genomic clones was reiterated
to derive the full-length sequence. The full-length DNA sequences as well as their splice
forms, and the full-length protein sequences that they encode, are disclosed herein.
NOV-3 was mapped to chromosome 17.

Based on the CuraGen SeqCalling database information, NOV-3 is expressed in
heart tissue. Moreover, based on the expression of STE20 family members, the following
tissues are also likely to express the invention: brain (especially hippocampus and cerebral
cortex), prostate, and blood hematopoietic cell lines. The patterns of expression for this gene
and its family members, combined with its similarity to the STE20 kinase family of genes,
suggest that the NOV-3 proteins function as kinases in the tissues of expression. Thus, NOV-3
is implicated in disorders involving these tissues. Some of these disorders include:
cardiovascular disorders, diabetes, leukemia/lymphoma, cancer, musculoskeletal disorders,
muscular generation, reproductive health, metabolic and endocrine disorders, gastrointestinal
disorders, immune and autoimmune disorders, respiratory disorders, bone disorders, and
tissue/cell growth regulation disorders.

Additional utilities for NOV-3 nucleic acids and polypeptides according to the
invention are also disclosed herein.

NOV-3a

A NOV-3a sequence according to the invention is a nucleic acid sequence encoding a
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3a nucleic acid
and its encoded polypeptide include the sequences shown in Table 10. The disclosed nucleic
acid (SEQ ID NO: 6) is 3999 nucleotides in length and contains an open reading frame (ORF)
that begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at
nucleotides 3997-3999. The start and stop codons are shown in bold font. The respective
ORF encodes a 1332 amino acid polypeptide (SEQ ID NO: 7).
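
The relationship between the stated nucleotide length and the stated polypeptide length can be checked with simple codon arithmetic. The Python sketch below assumes an ORF that starts at nucleotide 1 and runs to the end of the sequence, with the final codon being the stop codon; the function name is hypothetical.

```python
def orf_summary(nucleotide_length, start=1):
    """Summarise an ORF running from `start` (1-based) to the sequence end.

    Assumes the last codon is the stop codon, which encodes no amino acid.
    """
    coding_length = nucleotide_length - start + 1
    if coding_length % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = coding_length // 3
    stop_first = start + coding_length - 3  # first nucleotide of the stop codon
    return {
        "codons": codons,
        "amino_acids": codons - 1,
        "stop_codon": (stop_first, stop_first + 2),
    }


# Lengths stated in this document for NOV-3a (3999 nt) and NOV-3b (3912 nt).
print(orf_summary(3999))
print(orf_summary(3912))
```

For a 3999-nucleotide ORF this gives 1333 codons, of which 1332 encode amino acids, and a three-nucleotide stop codon that occupies the last three positions of the sequence.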
TABLE 10



ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCCT 

TGTGGAGGTGGTCGGCyiATGGAACCTACGGACAGGTGTACAAGGGTCGX3CATGT(^^ 

AGGTCATGGATGTOVCGGAGGACGAGGAGGAAGAGATCAAAaiGGAGATCAACATGCT 

25 AACATCGCCACCTACTACGGAGCCTTCATCAAGAAGRGCCCCCCGGGAAACGATGACCAGCTCTGGCTGGTGATG^ 
CrGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAAAGGO^CGCC^^ 
GCAGGGAGATCCTCAGGGGTCTGGCCCATCTCCATGCCCACAAGGTGATCCATCGAGA 
CTGAOVGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCACCGTGGGCAG^ 
TTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCGCCTGTGATGAGAA 

30 GTGATATTTGGTCTCTAGGAATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGAC^^ 
GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCaGGCTCAAGTCCAAGAAGTGGTCTAAGAAGTTC^ 
CaGATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGCAGCTACTGAAGTTTCCCTTC^^ 
CGGAGCGGCAGGTCCGCATCCyW5CTTAAGGACCACATTGACCGATCCCGGAAGAAGCG6GGTG 
TATGAGTACAGCGGCAGCGAGGAGGflAGATGaCAGCCATGGAGAGGAAGGAGAGCCAAGCTCCATCT^ 

35 AGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTCAGAGGCT^ 
AGCTGCAGCAGCAGCAGCAGCGAGACCCCGAGGCACACATCRAACACCTGCTGCA 
CAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCaGCGGAA^^ 

GCGGCGGCTGGAGGACATGCAGGCTCTGCGGCGGGAGGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTC 
ACAGGCTAGAGGAGGAGCAGCGACAGCTCGAGATCCTTCAGCAACAGCTGCTCCAGGAACAGeCCCPGCTG^ 
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AAGOSGAAGCAGCTGGAGGAGCAGCGGCAGTCAGAACGTCTCCavGAGGCA^ 
CCTGCAGOiGCAGCaUVCAGCAfiCAGCAGCTTCA^^ 

ACCATTATGGTCGGGGCaTGAATCCCGCTGaCMACCAGCCTGGGCCCGAGAGGTAGflA 
CAGCAGi^CTCTCCCrTGGCCAAGAGCAAGCa«5GCaGC7^CGGGGC 
5 CCCAGGACCCCTTTCCCAGACTCCTCCTATGCAGAGGCCGGTGGAGCCCCAGGAGGGA^ 

ACCGGGTCCmCTGAAGCCATATGCAGCACCTGTACCCCGATCCCaGTCCCTGCaGGACCAGCCra^ 
GCCTTCCCAGCCTCCCATGACCCCGACCCTGCCATCCCaXIACCCACTGCCACGCCC^^ 
CCAGflATTCAGACCCCACCTCTGAAGGACCTGGCCCraGCCCGAATCCCCCAGCCTGGGTCCGCCC^^ 
CACCCAAGGTGCCTOVGAGGACCTCATCrrATCGCCACTGCCCTTAACACCaGTGGG^ 
10 GCAGTCOSTGCCAGTAACCCCGACCTCAGGAGGAGaaACCCTGGCTGGGAACGCTCGGAC^^ 
CGGGCACCTCCCCCAGGCTGGCTCaCTGGAGCGGAACCGCGTGGGAGTCTCCTCC^^ 
CCCCTGGGAATAAAGCCAAGCCCGACGACmCCGCTCACGGCCAGGCaSGCCCGC^^ 
GACTTTGTGTTGCTGAAAGAGOSGACTCTGGACGAGGCCCCTCGGCCT 

CGAGGAGGTGGAAAGCAGTGAGGACGACGAGGAGGAAGGCGAAGGCGGGCCAGCAGAGGGGAGCAGAGATACCCC^ 

1 5 GCCGCAfiCGATGGGGATAmGACAGCGTCAGaVCCATGGTGGTCCACGACGTCGAGm 

TACGGGGGCm::ACCy^TGGTGGTCCAGCGCACCCCTGAAGAGGAGCGGAACCTOCTGCATGCT 
AAACCTGCCTGACGTGGTCCAGCCCAGCCACTCaCCCACCGAGAACAGCAAAGGCCM^ 
GTGGTGACTACCAGTCTOSTGGGCTGGTAAAGGCCCCrGGCAAGAGCTCGTTCACGATGTTTGTGGATCTAGGGA 
CAGCCTGGAGGCAGTGGGGACAGCATCCCCATCACAGCCCTAGTGGGTGGAGAGGGCACTCGGCT^^ 

20 CGACGTGAGGAAGGGTTCTGTGGTCAACGTGAATCCCACa^CACCCGGGCCCACAGTGAGACCOT 
ACAAGAAGOsATTCAACTCCGAGATCCTCTGTGCAGCCCTTTGGGGGGTC^ 
ATGTTGCTGGACCGAAGTGGGCAGGGCAAGGTGTATGGACTCATrGGGCGGCGACGCTTCCAGC^ 
GGGGCTCy\ACCTGCTCATCACCATCTCAGGGAAAAGGAACAAACTGCGGGTGTATTACCT 
TTCTGCAOyVTGACCCaVGAAGTGGAGAAGAAGCAGGGCTGGACCACCGTGGGGGAC^^ 

25 GTTGTGAAATACGAGCGGATTAAGTTCCTGGTCATCGCCCTCAAGAGCTCCGTGGAGGTGTATGCCTGGGCCCCCAAACC 
CTACeACAAATTCATGGCCTTCAAGTCCTTTGCCGACCTC(XCCACCGCCCTCT 
GGCAGCGGCTCaAGGTCATCTATGGCTCCAGTGCTGGCTTCCATGCTGTGGATGTCGACT 
TACATCCCTGTGCACATCCAGAGCCAGATCACGCCCCaTGCCATCATCTTCCTCCCC^ 
GCTGTGCTACGAGGACGAGGGTGTCTACGTCAACACGTACGGGCGCATCATTAAGGATGTGGTGOT 

30 TGCCTACTTCTGTGGCCTACaTCTGCTCCaACCAGATAATGGGCTGGGGTGAGAAAGC^ 
ACGGGCCACCTCGACGGGGTCTTCS^TGCACauyiCGAGCTCAGAGGCTCAAG^ 

TTTTGCCTCAGTCCGCTCTGGGGGCAGCAGCCAAGTTTACTTCATGACTCTGAACCGTAACTGCATm^ 

(SEQIDN0:6) 

35 MGDPAPARSLDDIDLSALRDPAGIFELVEWGKGTyGQVYKGRHVKTGQLAAIECVMDVTEDEEEEIK^^ 
NIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSWDLVKNTKGNALKEDCIAYIO^II^ 
LTENAEVKIiTOFGVSAQLDRWGRRNTFIGTPYWMZVPEVIACDENPDATYDYRSDIWSLGITAIEMAEGM 
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQIJ^FPFIRDQPTERQVRIQI^^ 
YEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFXiRLQQENKSNSEALKQQQQLQQQQQRDPEAHIJ^^ 

40 QKEERRRVEEQQRREREQRKLQEKEQQRRI^DMOALRREEERRQAEREQEyiRHRLEEEQRQI£ 
iCRKQI£EQRQSERLQRQLQQEHAYIJCSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMN^^ 

QQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRN^ 
AFPASHDPDPAIPAPTATPSARGAVIRQNSDPTSEGPGPSPNPPAOTRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 
AVRASNPDIiRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRP^ 
45 DFVLIJ^RTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTM^ 
YGGGTMWQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGK^ 
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QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFN 
MLLDRSGQGKSnfGLIGRRRPQQMDVLEGlJJLLITISGKRNKiaVYYLSW 
VVKYERIKFLVIALKSSVEVYAWAPKPYHKmiFECSFADLPHRPLLVDLT^ 
YIPVHIQSQITPHAIIFLPNTDGMEmLCYEDEGVYVmrfGRIIKDVVLQWGE^ 
5 TGHLDGVETfflKRAQRIJCFIiCEa««DKVFFASVRSGGSSQVYF^ (SEQ ID NO: 7) 

The disclosed NOV-3a nucleic acid sequence has homology (73% identity) to a mouse
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table
11. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the
"Expect" value, the probability of this alignment occurring by chance alone is 4.3e-298, which
is an extremely low probability score. Moreover, the disclosed, encoded amino acid sequence
has 1095 of 1332 amino acid residues (82%) identical to a human NIK-related protein
(GenBank Accession No: BAA90753), as shown in Table 12. As indicated by the "Expect"
value, the probability of this alignment occurring by chance alone is 0, the lowest probability
score.
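
The percentages quoted above can be reproduced from the identity counts, and an Expect value can be related to a chance probability. The Python sketch below is illustrative only: the function names are hypothetical, and the conversion P = 1 - exp(-E) is the standard relation from BLAST statistics rather than something stated in this document.

```python
import math


def percent_identity(identical, aligned_length):
    """Percentage of aligned positions that are identical."""
    return 100.0 * identical / aligned_length


def chance_probability(expect):
    """Probability of at least one chance hit, P = 1 - exp(-E).

    For very small E this is numerically equal to E itself.
    """
    return -math.expm1(-expect)


# Counts reported in Tables 11 and 12 of this document.
print(int(percent_identity(1224, 1657)))   # nucleotide alignment, truncated to a whole percent
print(int(percent_identity(1095, 1332)))   # amino acid alignment, truncated to a whole percent
print(chance_probability(4.3e-298))
print(chance_probability(0.0))
```

Truncating rather than rounding the first ratio gives the 73% figure reported in the table header. An Expect value of 0 typically means the true value is smaller than the program can represent, not that a chance match is strictly impossible.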



TABLE 11 

Score = 3892 (584.0 bits), Expect = 4.3e-298, Sum P(2) = 4.3e-298
Identities = 1224/1657 (73%), Positives = 1224/1657 (73%), Strand = Plus / Plus



NOV3a: 4 GGCGACCCAGCC-CCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCC 62
         GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC
NIK  : 3 GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62

NOV3a: 63 TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 122
          TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA
NIK  : 63 TGCTGGGATTTTTGAGCTGGTGGAAGTGGTTGGAAATGGCACCTATGGACAAGTCTATAA 122


NOV3a: 123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA 182
           GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA
NIK  : 123 GGGTCGACATGTTAAAACGGT-CA-CTGCC-GCCATCAAGGTTATGGACGTGACCGAGGA 179

NOV3a: 183 CGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGCAA 242
           GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA
NIK  : 180 TGAAGAGGAAGAAATCACACTGGAGATAAATATGCTGAAGAAGTATTCTCATCATCGAAA 239

NOV3a: 243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 302
           AT GCCAC TACTA GG GC TTCAT AAGAAGAGCCC CC GGA A GATGACCA CT
NIK  : 240 TATTGCCACGTACTATGGTGCTTTCATTAAGAAGAGCCCTCCAGGACATGATGACCAACT 299

NOV3a: 303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 362
           CTGGCT GT ATGGAGTT TGTGG GCTGG TC T AC GACCT GT AAGAACAC AA
NIK  : 300 CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 359

NOV3a: 363 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 422
           AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG T
NIK  : 360 AGGGAACACTCTCAAAGAAGACTGGATTGCTTACATCTCCAGGGAAATCCTCAGGGGATT 419

NOV3a: 423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 482
           GGC CATCTCCAT CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT
NIK  : 420 GGCACATCTCCATATTCACCACGTTATTCACCGAGATATCAAGGGCCAAAATGTGCTGCT 479

NOV3a: 483 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCAC 542
           GAC GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G AC
NIK  : 480 GACCGAGAATGCTGAGGTGAAACTTGTTGATTTTGGTGTAAGCGCTCAGCTGGACAGGAC 539

NOV3a: 543 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 601
           GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG
NIK  : 540 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG 598

NOV3a: 602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 660
           CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA T TGGTC CT GG
NIK  : 599 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC 657

NOV3a: 661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 720
           ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA
NIK  : 658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 717

NOV3a: 721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 780
           GC CT TT CTCAT CC G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG
NIK  : 718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 777

NOV3a: 781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA 838
           AA TT TT A CTT AT GA TGTCT T AAGA TTAC TG AGC GCCC C A
NIK  : 778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 835

NOV3a: 839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACGAGCCCACGGAGCGGCAGGTCCGCA 898
           C GAGCA CT T AA CC TTCAT GGGA CAGCCCA GA GGCAGGT CG A
NIK  : 836 CAGAGCAACTTTTAAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 895


NOV3a: 899 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 958
           TCCAGCTTAAGGA CACAT GACCG CC G AAGAAG G GG GAGAAAGA GAGAC G
NIK  : 896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 955

NOV3a: 959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG--AGGAAGGAGAG
           A TA GAGTACAGCGG AGCGAGGAGGA GA GA AG C TG AG AGGA GGAGAG
NIK  : 956 AGTACGAGTACAGCGGGAGCGAGGAGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG 1014

NOV3a: 1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 1074
            CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT G CT
NIK  : 1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 1074

1132^' "^^^^ *^G^GG^^AAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGC^^ 

CAGCAGGA AA AAG AGC TC GAGGCT T AG CAGCAGC CTGCAG AGC 

1132 ' "^^"^^ ^^^^^^^^^^^AGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 

1192^' ^^'^^ ^^^^<^GCGAGACCCCGAGGCACACATC7VAACACCTGCTGCACCAG 

AGCAGC CG GA C GAGG A A A OA CTGCTG AG GGCAG GCG A 
1192 ' '^'^^'^ ^^^TCCGGGAGCAGGAGGAGTATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA 

30 1251^' '^^^^ ^^^G^G^^GGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 

T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G 
1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 



NIK 
1252 



1307^' ^^^^ ^G<^G^GCTGCAGGAGAAGGAGCAGCAGCGGCG-G~CTGGAGGACATGCAGGC-TCT 

C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT 
NIK : 1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 

40 

N0V3a : 1308 GCGGCGGGA—GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTCACAGG 
1365 

CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG 

NIK : 1311 -CGA-GGAACTGGAAAGGCGGCGTAAAGAAGAGGAAGAG-AGGAG-ACGGGCAGAAGAGG 
4j 1366 

N0V3a: 1366 CTA-6AGGAG-GAGCAGCGACAGCTCGAGATCCTTCAGCAACAGCTGCTCCAGGAACAGG 
1423 

A GAGGAG G G AG G AC GAG T C TCAG C GC GCT AGGA AG 
1423 ' '^^^'^ ^^GAGGAGAGTGGAGAGGGAACAGGAG-TACATCAGG—CGGCAGCTAGAGGAGGAGC 

N0V3a: 1424 CCCTGCTGCTGGA-ATACA—AGCGGAAGCAGCTGGAGGAGCAGCGGCA-GTCAGAACGT 
*)D 1479 

C GC CTGGA AT C AGC G AGC GCT AGGAGCAG G CA GT A C 
1424 AGCGGCACCTGGAGATCCTGCAGCAGCAGCTGCTCCAGGAGCAG-GCCATGTTACTGCAC 



NIK 
1482 



60 N0V3a: 1480 CTCCAGAGGCAGCTGCA-GCAGGAGCATGCCTACCTCAAGTCCCTGCAGCAGCAGCAACA 
1538 

CCA AGG GC GGA GCA AGCA GC CC C G CCC GCAGCAGCAG A CA 
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10 



15 



20 



NIK : 
1537 

N0V3a: 
1590 

NIK : 
1595 

NOVSa: 
1647 

NIK : 
1654 

NOVSa: 

NIK : 



1483 GACCACAGGAGGCCGCACGCAC-AGCA-GCAG-CCGCC-GCCCCCGCAGCAGCAGGA-CA 

1539 GCAGCAG~C-AGCTT-CA-GAAACAGCAGCAGCAGCAGCTCC-TG"CC-TGGGGACAGG 

G AGCA C AGCTT CA G CAG AGO AGC C C TG CC TG GACAG 
1538 GGAGCAAACCGAGCTTTCATGCTCCAG-AGCCC7VAGCCTCACTATGACCCTGCTGACAG- 

1591 AAGCCCCTGTACCATTATGGTCGGGGCATGAATCCCGCT-GA-CAAAC-CAGCCTGGGCC 

AGC C G A TGGTC C G ATC C C GA CAA C CC G C 
1596 -AGCTCGGGAGGTACAGTGGTCCCACCTGGCATCTCTCAAGAACAATGTCTCCCCTGTCT 



1648 CGAGA 1652 (SEQ ID NO: 62) 
CGAGA 

1655 CGAGA 1659 (SEQ ID NO: 29) 



TABLE 12 



Score = 2104 bits (5451), Expect = 0.0

Identities = 1095/1332 (82%), Positives = 1095/1332 (82%), Gaps = 37/1332



MGDPAPARSIiDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60 
MGDPAPARSLDDI DLSALRDPAGI FELVEWGNGT YGQVYKGRHVKTGQLAAIKVMDVT 

MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60 

XXXXXIKQEINMLKKYSHHRNIAT YYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 

DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 



KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 



TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 



ALFIilPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 



(2%) 




N0V3a: 


1 


NIK : 


1 


N0V3a: 


61 


NIK : 


61 


N0V3a; 


121 


NIK : 


121 


N0V3a: 


181 


NIK : 


181 


N0V3a: 


241 


NIK : 


241 


N0V3a: 


301 


NIK : 


301 


N0V3a: 


361 


NIK : 


361 


N0V3a: 


421 


NIK : 


421 


N0V3a: 


481 



QLKDHI 



ENKSNSEALK 



PSSIMNVPGESTLRREFLRLQQ 



RDPEAHIKHLLH 



DMQAL 



KRK 



HAYLKS 

49 



-RREEERRQAEREQEY 451 
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NIK : 452 KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLra^ 511 

N0V3a: 541 NPADKPAWAREVEERTRMbmQQNSPLJ^SKPGSTXXXXXXXXXXXXXXXXXXXXX^ 600 

NPADKPAWAREVEERTRMNKQQNSPLAKSKPGST MQRP 
NIK : 512 NPADiCPAWAREVEERTRMNKQQNSPLAKSBCPGSTGPEPPIPQASPGPPGPLSQTPPMQRP 571 

N0V3a: 601 VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHXXXXXXXXXXXXXX 660 

VEPQEGPHKSLVAHRVPLKPYT^APVPRSQSLQDQPTRNLAAFPASH 
NIK : 572 VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 631 

N0V3a: 661 XRGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 720 

RGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 
NIK : 632 ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 691 

15 N0V3a: 721 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 780 
AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 
NIK : 692 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 751 

N0V3a: 781 PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYXXXXXXXXXXXXXXXXX 840 
20 PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDY 

NIK : 752 PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 803 

N0V3a: 841 XXXXXXXXRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 900 

RDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 
25 NIK : 804 EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 863 

N0V3a: 901 ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 960 

ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
NIK : 864 ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 923 



30 
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N0V3a: 961 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 
1020 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTEIAHSETPEIRKYKKRFNS 
NIK : 924 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 983 



N0V3a: 1021 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNIiLITISGKRN 
1080 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 
NIK : 984 EILCAALWGWLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 
40 1043 

N0V3a: 1081 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRWKYERIKFLVIALKSSVEV 
1140 

KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRWKYERIKFLVIALBCSSVEV 
45 NIK : 1044 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRWKYERIKFLVIALKSSVEV 
1103 

N0V3a: 1141 YAWAPKPYHKEmFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
1200 

50 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
NIK : 1104 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
1163 

N0V3a: 1201 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
55 1260 

YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
NIK : 1164 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
1223 

60 N0V3a: 1261 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 
1320 

ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 
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NIK : 1224 ICSNQIMGWGEKMEIRSVETGHLDGVEKHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 
1283 

N0V3a: 1321 FMTLNRNCIMNW 1332 {SEQ ID NO: 63) 
5 ^TLNRNCIMNW 

NIK : 1284 FMTLNRNCIMNW 1295 (SEQ ID NO: 30) 

Based on its relatedness to known members of the STE20 family of protein kinases,
NOV-3a provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the STE20 family of
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving metabolic and endocrine disorders,
cancer, bone disorders, and tissue/cell growth regulation disorders.


NOV-3b 

A NOV-3b sequence according to the invention is a nucleic acid sequence encoding a
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3b nucleic acid
and its encoded polypeptide include the sequences shown in Table 13. The disclosed nucleic
acid (SEQ ID NO: 8) is 3912 nucleotides in length and contains an open reading frame (ORF)
that begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at
nucleotides 3910-3912. The start and stop codons are shown in bold font. The respective
ORF encodes a 1303 amino acid polypeptide (SEQ ID NO: 9).

TABLE 13

ATGGGCGACCGAGCCCCCGCCCGCaGCCTGGACGACATCGACCTGTCCGCCCTGCGGGAC 
TGTGGAGGTGGTCGGCAATGGAACCTACGGACaGGTGTACAAGGGTCGGCATGTCAAGACGGGGCA^ 
AGGTCATGGATGTCACGGAGGACGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAflAAAOT^ 
AAOVTCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCaWKTrCTGG^ 
30 CTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACAaWVAAGGCAACGCCCTGAAGGAGGACTGT 
GCAGGGAGATCCTCaGGGGTCTGGCCCATCTCCATGCCCACAAGGTGATCCATCGAC^ 
CT6ACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCACCGTGGGC 
TTTGATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCGCCrGTGATGAGAACCCTGATGCCACC^ 
GTGATATTTGGTCTCTAGGAATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGC^ 

35 

GCCCTCOTCCrCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAGAAGTTCAT • 
CACATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGCAGCTACTGAAGTTTCCCT 
CGGAGCGGCAGGTCCGCATCCAGCTTAAGGACCS^CATTGACCGATCCCGGAAGAAGCGGGGTGAGAZVAGAGG^ 
TATGAGTACAGCGGCAGCGAGGAGGAAGATGACAGCCATGGAGAGGAAGGAGAGCCAAGCTCCATC^^ 
AGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTCAGAGGCTT^ 
40 AGCTGCAGCAGCAGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCA^ 

CAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCAGCGGAAGCTG^ 
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GCGGCGGCTGGAGGAGATGCAGGCTCTGCGGCGGGAGGAGGAGCGGCGGCAGGCGGAGCGCGA 

ACAGGCTAGAGGAGCAGO^GCAGTCaGAACGTCTCCAGRGGCa 

CAGCaGCAACAGCAGCAGCAGCOTCaVGAAACAGCAGCSU^ 

TGGTCGGGGCATGMTCCCGCTGACAAACCAGCCTGGGCCCGAGAGGTAGAAGMAGAAC^ 

ACTCTCCCTTGGCCaAGAGC^^GCCAGGCaVGCACG^ 

CCCCTTTCCCAGACTCCTCCTATGCAGAGGCCGGTGGAGCCCCAGGAGG^^ 

CCCACTGAAGCCyVTATGCAGCACCTGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCAC^^ 

CAGCCTCCCATGACCCCGACCCTGCCATCCCCGCACCCaCTGCCACGCCCAGTGCCCGAGGAGCT 

TCAGACCCCACCTCTGAAGGACCTGGCCCCAGCCCGAATCCCCraGCCT^ 

GGTGCCTCAGAGGACCTCATCTATCGCCACTGCCCTTAACACCAGTGGGGCCGGAGGGTCCCGG^ 

GTGCCAGTAACCCCGACCTCAGGAGGAGCGACCCTGGCTGGGAACGCTCGGACAGCGTCCTTCm 

CTCCCCCAGGCTGGCTCACrGGAGCGGAACCGCGTGGGAGTCTCCTCCAAACCGG^ 

GAATAAAGCCaAGCCCGACGACCACCGCTCACGGCCAGGCCGGCCaSCAAGCTA 

TGTTGCTGAAAGAGCGGACTCTGGACGAGGCCCCTCGGCCTCCCAAGAAGGCCATGGACT 

GTGGAAAGCAGTGAGGACGACGAGGAGGAAGGCGAAGGCGGGC(^GCAGAGGGGAGCAGAGATACCCCT 

CGATGGGGATACAGACAGCGTCAGCACCaTGGTGGTCCACGACGTCGAGGAGATCACCGGGACCC^ 

GCGGCACCATGGTGGTCCAGCGCACCCCTGAAGAGGAGCGGAACCTGOTGCATGCT 

CCTGACGTGGTCCAGCCCAGCCACTCACCCACCGAGAACAGCAAAGGCCAAA^ 

CTACCAGTCTCGTGGGCTGGTAAAGGCCCCTGGCAAGAGCTCGTTCT^CGATGTTTGT^ 

GAGGCAGTGGGGACAGCATCCCCATCaCAGCCCTAGTGGGTGGAGAGGGCACrrCG^ 

AGGAAGGGTTCTGTGGTCAACGTGAATCCCACCAACACCCGGGCCCACAGTGAGACCCCTGAGATCCGG^ 

GCGATTCAACTCCGAGATCCTCTGTGCAGCCCTTTGGGGGGTCAACCTGCTGGTGGGC^ 

TGGACCGAAGTGGGCAGGGCAAGGTGTATGGACTCATTGGGCGGCGACGCTTCCSW^GATGGA 

AACCTGCTCATCACCATCTCAGGGAAAAGGAACAAACTGCGGGTGTATTACCTGTCCTGGCTCCGGA^ 

CAATGACCCAGAAGTGGAGAAGAAGCAGGGCTGGACCACCGTGGGGGACATGGAGGGCTGCGGGCACT 

AATACGAGCGGAOTAAGTTCCTGGTCaTCGCCCTCAAGAGCTCCGTGGAGGTGTATGCCTGGGCCCCCAAACCCTACC^ 

AAATTCATGGCCTTCAAGTCCTTTGCCGACCTCCCCCaCCGCCCTCTGCTGGTCGaCC 

GCTCAAGGTCATCTATGGCTCCaGTGCTGGCTTCCATGCTGTGGATGTCGACTCGGGGJ^ 

CTGTGCACATCCAGAGCCAGATCACGCCCCATGCCATCATCTTCCrrCCCCAACTiCC^ 

TACGAGGACX;A6GGTGTCTACGTCyyv»CGTACGGGCGCATCATTAAGGAT6TGGTGCTGa^ 

TTCTGTGGCCTACATCTGCTCCaU\CCAGATAATGGGCTGGGGTGAGAAAGCCATTGA 

ACCTCGACGGGGTCTTCATGCaVCaAACGAGCTa^GGCTCAAGTTCCTGTGTGAGCGG 

TOiGTCCGCTCTGGGGGCAGCAGCCAAGTTTACTTCATGACTCTGAACCGT^ (SEQ ID 

NO: 8) 

MGDPAPARSLDDIDIiSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQIJ^ 

NIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNTKGNALKEDCIAYICREILRGIJ^ 

LTENAEVBOiVDEXSVSAQLDRTOGRRNTFIGTPYWMAPKVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPL 

ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQTO^ 

YEYSGSEE^DDSHGEEGEPSSIMNVPGESTLRREFLRLQQENKSNSEAIJCCKK3QLQQQQQRDPEM 

QKEERRRVEEQQRREREQRKLQEKEQQRRLEDMQAIiRREEERRQAEREQEYIRHRLEEQRQSERL^^ 

QQQQQQQLQKQQQQQIiPGDRKPLYHYGRQ5NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGP 

PLSQTPPMQRPVEPQEGPHKSLVAHRVPIJCPYAAPVPRSQSLQDQPTRNIJUVFPASHDPDPAIPAM 

SDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATAMTSGAGGSRPAQAVRASNPDIJmSDPGWERSD^^^ 

LPQAGSIJERNRVGVSSKPDSSPVLSPGNKZUCPDDHRSRPGRPASYKRAIGEDFVI^ 

VESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTM\AraDVEEIT6TQPPYGGGTMVVQRTPEEERNI»LHADSN6Y 

WO 01/62928 PCT/US01/06151

PDWQPSHSPTENSKGQSPPSEa5GSGDYQSRGLVKaPGKSSFTMF\n)LGIYQPGGSGDSIPITALV6GEGTRLDQW 

RKGSVVNVNPTNTRAHSETPEIRKYKKRETJSEILCAALWGVNLLVGTENGI^^ 

Nl^ITISGKRNKLRVYYLSWLRNKIMNDPEVEKKCKSWT^ 

KFMAFKSFADLPHRPLLVDLTVEEGQRLK\rrYGSSAGETIAVDVDS6NSYDIYlPraiQSQITPH^ 
5 YEDEGVYVNTYGRIIKDVVLQWGE^4PTSVAyICSNQIMGWGEKMBIRSVETGHLDGm^^ 
SVRSGGSSQVYFMTLNRNCIMNW (SEQ ID NO: 9)

The disclosed NOV-3b nucleic acid sequence has homology (75% identity) to a mouse
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table
14. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the
"Expect" value, the probability of this alignment occurring by chance alone is 3.3e-295, which
is an extremely low probability score. Moreover, the disclosed, encoded amino acid sequence
has 1093 of 1303 amino acid residues (83%) identical to a human NIK-related protein
(GenBank Accession No: BAA90753), as shown in Table 15. As indicated by the "Expect"
value, the probability of this alignment occurring by chance alone is 0.0, the lowest probability
score.
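
For orientation, an Expect value of the kind quoted above is conventionally related to a chance probability by P = 1 - exp(-E) in BLAST-style searches. The sketch below assumes the reported values are E-values of that kind (the text does not name the search program), and `evalue_to_pvalue` is a hypothetical helper, not part of any tool mentioned in the source.

```python
import math


def evalue_to_pvalue(e_value: float) -> float:
    """Convert an Expect value E to the probability of at least one chance hit.

    Uses P = 1 - exp(-E). math.expm1 keeps precision when E is tiny,
    where a naive 1 - math.exp(-E) would round to 0.0.
    """
    if e_value < 0:
        raise ValueError("Expect values cannot be negative")
    return -math.expm1(-e_value)


# Expect value reported for the NOV-3b / mouse NIK nucleotide alignment.
p_nucleotide = evalue_to_pvalue(3.3e-295)
```

For values as small as 3.3e-295 the probability is numerically indistinguishable from the Expect value itself, which is consistent with the text reading the Expect value directly as a probability.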
TABLE 14


Score = 


3828 


Identities - 


Plus 




N0V3b: 


4 


NIK : 


3 


N0V3b: 


63 


122 




NIK : 


63 


122 




N0V3b: 


123 


182 




NIK : 


123 


179 




N0V3b: 


183 


242 




NIK : 


180 


239 




N0V3b: 


243 


302 




NIK : 


240 


299 





GGCGACCCAGCC-CCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCC 62 
GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC 
GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62 

63 TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 

TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA 



123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA 
GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA 



183 CGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGCAA 



GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA 
TGAAGAGGAAGAAATCACACTGGAGATAAATATGCTGAAGAAGTATTCTCATCATCGAAA 



243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 
AT GCCAC TACTA GG GO TTCAT AAGAAGAGCCC CO GGA A GATGACCA CT 
NOV3b:  303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 362
            CTGGCT GT ATGGAGTT TGTGG GCTGG TC  T AC GACCT GT AAGAACAC AA
NIK  :  300 CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 359

NOV3b:  363 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 422
            AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG  T
NIK  :  360 AGGGAACACTCTCAAAGAAGACTGGATTGCTTACATCTCCAGGGAAATCCTCAGGGGATT 419

NOV3b:  423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 482
            GGC CATCTCCAT   CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT
NIK  :  420 GGCACATCTCCATATTCACCACGTTATTCACCGAGATATCAAGGGCCAAAATGTGCTGCT 479

NOV3b:  483 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCAC 542
            GAC GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G AC
NIK  :  480 GACCGAGAATGCTGAGGTGAAACTTGTTGATTTTGGTGTAAGCGCTCAGCTGGACAGGAC 539

NOV3b:  543 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 601
             GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG
NIK  :  540 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG 598

NOV3b:  602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 660
            CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA  T TGGTC CT  GG
NIK  :  599 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC 657

NOV3b:  661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 720
            ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA
NIK  :  658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 717

NOV3b:  721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 780
            GC CT TT CTCAT CC  G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG
NIK  :  718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 777

NOV3b:  781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA 838
            AA TT  TT A CTT AT GA    TGTCT  T AAGA TTAC TG AGC GCCC C  A
NIK  :  778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 835

NOV3b:  839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCACGGAGCGGCAGGTCCGCA 898
            C GAGCA CT  T AA    CC TTCAT  GGGA CAGCCCA  GA  GGCAGGT CG A
NIK  :  836 CAGAGCAACTTTTAAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 895

NOV3b:  899 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 958
            TCCAGCTTAAGGA CACAT GACCG  CC G AAGAAG G GG GAGAAAGA GAGAC G
NIK  :  896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 955



NOV3b:  959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG-AGGAAGGAGAG 1014
            A TA GAGTACAGCGG AGCGAGGAGGA GA GA  A G  C TG AG AGGA GGAGAG
NIK  :  956 AGTACGAGTACAGCGGGAGCGAGGAGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG 1014

NOV3b: 1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 1074
            CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT  G CT
NIK  : 1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 1074

NOV3b: 1075 CAGCAGGAAAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGCAGCAGCTGCAGCAGC 1132
            CAGCAGGA AA AAG AGC   TC GAGGCT T     AG CAGCAGC  CTGCAG AGC
NIK  : 1075 CAGCAGGAGAACAAGGAGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 1132

NOV3b: 1133 AGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCA 1192
            AGCAGC  CG GA C  GAGG   A A  A  CA CTGCTG    AG GGCAG  GCG A
NIK  : 1133 AGCAGCTCCGGGAGCAGGAGGAGTATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA 1192
NOV3b:
1251 

NIK : 
1252 

N0V3b: 
1307 

NIK : 
1310 

N0V3b: 
1365 

NIK : 
1366 



1193 TAGAGGAGCAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 

T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G 
1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 



1252 CAGCGGAAGC^GCAGGAGAAGGAGCAGCAGCGGCG-G—CTGGAGGACATGCAGGC-TCT 

C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT 

1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 



1308 GCGGCGGGA—GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTCACAGG 

CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG 
1311 -CGA-GGAACTGGAAAGGCGGCGTAAAGAAGAGGAAGAG-AGGAG- ACGGGCAGAAGAGG 



50 



N0V3b: 
1418 

NIK : 
1424 



1366 CTA-GAGGAGCAGCGGC-AGT CAGAACGT-CTCCAGA-GGCAGCTGCAGGAGGAGCA 

A GAGGAG AG GG AG CAG A GT C CAG GGCAGCT AG AGGAGCA 

1367 AGAAGAGGAG-AGTGGAGAGGGAACAGGA-GTACATCAGGCGGCAGCTAGAGGAGGAGCA 
NOV3b:
1476 

NIK : 
1482 



1419 T-GCCTACCTCAAGTCCCTGCAGCAGCAGCAACAGCAGCAGCAGCTTCA-GAAACAGCAG 

G C ACCT AG CCTGCAGCAGCAGC C CAG AGCAG CA G AC GCA 
1425 GCGGC-ACCTGGAGATCCTGCAGCAGCAGCTGCTCCAGGAGCAGGC-CATGTTACTGCAC 



NOV3b: 1477 CAGCAGCAG 1485 (SEQ ID NO: 64)
             A CA CAG
NIK  : 1483 GACCA-CAG 1490 (SEQ ID NO: 31)



TABLE 15

Score = 2114 bits (5478), Expect = 0.0
Identities = 1093/1303 (83%), Positives = 1093/1303 (83%), Gaps = 8/1303 (0%)

N0V3b: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60 

MGDPAPARSLDDI DLSALRDPAGI FELVE WGNGTYGQVYKGRHVKTGQIiAAIKVMDVT 
NIK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60 

10 N0V3b: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIECKSPPGNDDQLWLVMEFCGAGSV^ 120 

IKQEINMLKKYSHHRNIATYYGAFIECKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 
NIK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 



15 



35 



N0V3b: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

KGNALKEDCIAYICREILRGIiAHLHAHKVlHRDIKGQN\rLLTENAEVBCLVDFGVSAQLDR 
NIK : 121 KGNALKEDCIAYICREILRGLMLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

N0V3b: 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 
TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 
20 NIK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 

N0V3b: 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 

ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 
NIK : 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 

25 

N0V3b: 301 QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360 

QLKDHI PSSIMNVPGESTLRREFLRLQQ 
NIK : 301 QLKDHIDRSRKKRGEECEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360 

30 N0V3b: 361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 420 
ENKSNSEALK RDPEAHIECHLLH 
NIK : 361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420 

N0V3b: 421 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHXXXXXXXXXXXXXXXXXXHAYLKS^ 480 

DMQAL Y R HAYLKS 

NIK : 421 LQEKEQQRRLEDMQALRREEERRQAEREQEYKRKQLEEQRQSERLQRQLQQEHAYLKSLQ 480 

N0V3b: 481 XXXXXXXXXXXXXXXXXPGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKS 540 

PGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKS 
40 NIK : 481 QQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKS 540 

N0V3b: 541 KPGSTXXXXXXXXXXXXXXXXXXXXXXMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ 600 

KPGST MQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ 
NIK : 541 KPGSTGPEPPIPQASPGPPGPLSQTPPMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ 600 

N0V3b: 601 SLQDQPTRNLAAFPASHXXXXXXXXXXXXXXXRGAVIRQNSDPTSEGPGPSPNPPAWVRP 660 

SLQDQPTRNLAAFPASH RGAVIRQNSDPTSEGPGPSPNPPAWVRP 
NIK : 601 SLQDQPTRNLAAFPASHDPDPAIPAPTATPSARGAVIRQNSDPTSEGPGPSPNPPAWVRP 660 

50 N0V3b: 661 DNEAPPECVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 720 
DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 
NIK : 661 DNEAPPBCVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGfi 720 

N0V3b: 721 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERT 780 
55 LPQAGSIiERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA DFVLLKERT 

NIK ; 721 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA DFVLLKERT 772 

N0V3b: 781 LDEAPRPPKKAMDYXXXXXXXXXXXXXXXXXXXXXXXXXRDTPGGRSDGDTDSVSTMVVH 840 
LDEAPRPPKiCAMDY RDTPGGRSDGDTDSVSTMWH 
60 NIK : 773 LDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTMWH 832 

NOV3b:  841 DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP 900
            DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP
NIK  :  833 DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP 892

N0V3b: 901 SKDGSGDYQSRGLVKAPGK6SFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 960 
5 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 

NIK : 893 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 952 

N0V3b: 961 IIKGSVVNWPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 
1020 

10 RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCT^WGVNLLVGTENGLMLLD^ 

NIK : 953 RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGWLLVGTENGLMLLDRSGQG 
1012 

N0V3b: 1021 ECVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
15 1080 

K\nfGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
NIK : 1013 KVYGLIGRRRFQQMDVLEGLNLLITIS6KRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
1072 

20 N0V3b: 1081 VGDMEGCGHYRWKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
1140 

VGDMEGCGHYRWKYEEaKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
NIK : 1073 VGDMEGCGHYRWKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
1132 

25 

N0V3b: 1141 TVEEGQRLlKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 
1200 

TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 
NIK : 1133 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 
30 1192 

NOV3b: 1201 YEDEGVYVNTYGRIIKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 
1260 

YEDEGVYVNTYGRIIKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 
35 NIK : 1193 YEDEGVYVNTYGRIIKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 
1252 

NOV3b: 1261 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1303 (SEQ ID NO: 65)
            HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW
NIK  : 1253 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1295 (SEQ ID NO: 32)

Based on its relatedness to known members of the STE20 family of protein kinases,
NOV-3b provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the STE20 family of
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving metabolic and endocrine disorders,
cancer, bone disorders, and tissue/cell growth regulation disorders.



NOV-3c

A NOV-3c sequence according to the invention is a nucleic acid sequence encoding a
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3c nucleic acid and
its encoded polypeptide include the sequences shown in Table 16. The disclosed nucleic acid
(SEQ ID NO: 10) is 3822 nucleotides in length and contains an open reading frame (ORF) that
begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at
nucleotides 3820-3822. The start and stop codons are shown in bold font. The respective ORF
encodes a 1273 amino acid polypeptide (SEQ ID NO: 11).

TABLE 16 

ATGGGCGACCCAGCCCCCGCCCGCaGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCCT 
TGTGGAGGTGGTCGGC/iATGGAACCTACGGACAGGTGTACaAGGGTCGGCATGTCMGACGGGGCArc 
10 AGGTCATGGATGTCACGGAGQACGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTG^^ 
AACATCGCCACCTACTACGGAGCCTTC7VTCaAGAAGAGCCCCCCGGGAAACGATGAC(^^ 
CTGTGGTGCTGGTTCAGTGACn'GACCTGGTAAAGAACACAAAAGGCAACGCCCTGAAGG^ 
GCAGGGAGATCCTCAGGGGTCTGGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATC^ 
CTGACAGAGAATGCTGAGGTCyiAGCTAGTGGATTTTGGGGTGAGTGCrCAGCTGGACCGCaCCGTGGG 

15 

TTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGT<3VTCGCCTGTGATGAGAACCCTGATGCCACCTATGATTACA 
GTGATATTTGGTCTCTAGGAATCAGAGCCaTCGAGATGGCaGaGGGaGCCCCCCCTCTGTGTGACATGCaC 
GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAGAAGTTCATTGACra 
avCATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGCAGCTACTGAAGTTTCCCT - 
CGGAGCGGCAGGTCCGCATCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAm 

20 

TATGAGTACAGCGGCAGCGAGGAGG7\AGATGACAGCCATGGAGAGGAAGGAGAGCCAAGCrCCATCATG 

AGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTm 

AGCTGCAGCyiGCAGCAGCAGCGAGACCCCGAGGCACACaTCATVACACCTGCT 

CAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCAGCGGAAGCTGCAGG^ 
GCGGCGGCTGGAGGACATGCaGGCTCTGCGGCGGGaGGAGGAGCGGCGGC^^ 

25 

ACAGGCTAGAGGAGGAGCAGCGAC7«3CTCGAGATCCTTCAGCAACAGCTGCrCCAGGM 
AAGCGGAAGCAGCTGGAGGAGCAGCGGCAGTCAGAACGTCTCCAGAGGCAGCTGCAG^ 
CCTGOVGCAGCAGCAACAGCAGCAGCAGCTTCAGAAACAGCAGCAGC?^^ 
ACCATTATGGTCGGGGCATGAATCCCGCTGACAAACCaGCCrGGGCCCGAGAGGTAGTGGCA^ 
CCATATGCAGCACCTGTACCCCGATCCCaGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCTGCCTTCC 
30 TGACCCCGACCCTGCCATCCCCGCaCCCACTGCCACGCCCAGTGCCCGAGGAGCTGTCATCCGCCa^^ 

CCTCTGAAGGACCTGGCCCCAGCCCGAATCCCCCAGCCTGGGTCCGCCCAGATAACGAGGCCCCACCCAAGGTG^ 
AGGACCTCyiTCTATCGCCACTGCCCTTAACACGAGTGGGGCCGGAGGGTCCCGGCCAGCCCaGGa^ 

ccccgacctcaggaggagcgaccctggctgggaacgctcggacagcgtccttccagcctctcacgggcac^^ 
ctggctcactggagcggaaccgcgtgggagtctcctccaaaccggacagctccccpgtgctctcccctgggaat;^^ 
35 aagcccgacgacgaccgctcacggccaggccggcccgcaagctataagcgagc^ 

agagcggactctggacgaggcccctcggcctcccaagaaggccatggactactcgtcgtccagcgaggaggtggaaag^ 
gtgaggacgacgaggaggaaggcgaaggcgggccagcagaggggagcagagatacccctgggggccgctvgcg^ 
acagacagcgtcagcaccatggtggtccacgacgtcgaggagatcaccgggaccaigcccccatac 
ggtggtccagcgcacccctgaagaggagcggaacctgctgcatgctgacagcaatgg^ 

40 

TCCAGCCCAGCCACTCACCCACCGAGAACAGCAAAGGCCAAAGCCCavCCCTCGAAGGATGGGAGTGGT^^ 
CGTGGGCTGGTAAAGGCCCCTGGCAAGAGCTCGTTCAaSATGTTTGTGGATCTAGGGATCPACC^^ 
GGACAGCATCCCCATCACAGCCCTAGTGGGTGGAGAGGGCACTGGGCTCGACCAGCTGCAGTACGACGTGAGG 
CTGTGGTCAACGTGAATCCCACCAACACCCGGGCCCACAGTGAGACCCCPGA6ATCCGGAAGTA 

TCCGAGATCCTCTGTGCAGCCCTTTGGGGGGTCAACCTGCTGGTGGGCaCGGAGAACGGGCTGATGTTGCTGGACCGAAG 
TGGGCAGGGCAAGGTGTATGGACTCATTGGGCGGCGACGCTTCCAGCAGATGGATGTGCT
TCACCATCTCAGGGAAAAGGAAaVAACTGCGGGTGTATTACCTGTCCPGGCT 
GAAGTGGAGAAGAAGCAGGGCTGGACCACCGTGGGGGACATGGA6GGCTGCGGGCACT 
GATTAAGTTCCTGGTCATCGCCCTGAAGAGCTCCGTGGAGGTGTATGCCTGGGCCCCCAAACCCTAC 
5 CCTTCiy\GTCCTTTGCCGACCTCCCCCACCGCCCTCTGCTGGTCG^^ 

ATCTATGGCTCCAGTGCTGGCTTCCATGCTGTGGATGTCGACTCGGGGAACTUSCTATGAmTC^ 
CCAGAGCCAGATCACGCCCCATGCCATCATCTTCCTCCCCyU^CACCGACGGCATGGAGATGCTGCT 
AGGGTGTCTACGTCAACACGTACGGGCGCATCATTAAGGATGTGGTGCTGCAGTGGGGGGAG^ 
TACakTCTGCTCCAACCAGATAATGGGCTGGGGTGAGAAAGCCATTGAGATCCGCTCTGTGGAGACGGGCCA 
1 0 GGTCTT»TGCACAAACGAGCTCAGAGGCTCAAGTTCCTGTGTGAGCGGAATG^ 

CTGGGGGCAGCAGCCAAGTTTACTTGATGACTCTGAACCGTAACTGCATmT (SEQ ID NO: 10) 



MGDPAPARSLDDIDLSALRDPAGIPELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTEDEKEEIKQ^ 
NIATYYGAFIKKSPPGWDDQLWLVMEFCGAGSVTDLVKNTKGNALKEDCIAYIC^^EII^^ 
15 LTENAEVKLVDFGVSAQLDRTOGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDbfflPm 
ALFLIPRNPPPIU^SKBCWSKKFIDFIDTCLIKTYLSRPPTEQLIJCFPFIRDQPTERQVRIQLKDHIDRSRI^ 
YEYSGSEEEDDSHGEEGEPSSIMNVPGESTIJIREFIJUjQQENKSNSEALKQQQQIKJQQQQRDPEA^ 
QKEERRRVEEQQRREREQRKLQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEEQRQL^ 
KRKQLEEQRQSERLQRQLQQEHAYLKSLCKXKXKXK^LQKQQQQQLL^ 

20 pyaapvprsqslqdqptrniaafpashdpdpaipaptatpsargavirqjisdptsegpgpspnppawvrpdneap 

RTSSIATALNTSGAGGSRPAQAVRASNPDIiRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSBCPDSSPV^ 
KPDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSS3SEEVESSEDDEEEGEGGPAEGSEU>T 

tdsvstmvvhdveeitgtqppygggtmwqrtpeeernijjiadsngytnlpdwqpshsptenskgqsppsk^ 
rglvkapgkssftmftolgiyqpggsgdsipitalvggegtrldqlqydvrkgsvvnvnptntrm 

25 SEILCAALWGWLLVGTENGIiMLLDRSGQGKVYGLIGRRRFQQMDVI£GI»NLLITISGKRNKIiRVY^ 
EVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPL^^ 
IYGS9AGFHAVDVDSGNSYDIYIPVHIQSQITPHAlIFLPNTD<aiEtnJ.CYEDEGV^ 

YICSNQIt5GWGEKAIEIRSVETGHLDGVmiKRAQRIJCFIiCERNDK\^FASVRSGGSSQ (SEQ 

ID NO: 11) 

30 

The disclosed NOV-3c nucleic acid sequence has homology (72% identity) to a mouse
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table
17. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the
"Expect" value, the probability of this alignment occurring by chance alone is 9.1e-299.
Moreover, the disclosed, encoded amino acid sequence has 1048 of 1332 amino acid residues
(78%) identical to a human NIK-related protein (GenBank Accession No: BAA90753), as shown
in Table 18. Furthermore, the encoded amino acid sequence also has homology (79% identity)
to a human GCK kinase (GenBank Accession No: BAA94838), another subgroup of the
STE20 kinase family, as shown in Table 19. As indicated by the "Expect" values, the
probability of each of these amino acid alignments occurring by chance alone is 0.0, the lowest
probability score.
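
The percentages quoted with the identity, positive, and gap counts in Tables 15, 18, and 19 can be reproduced from the raw counts. The sketch below is illustrative only: it assumes the percentages are integer-truncated rather than rounded, which is an inference from the reported figures and is not stated in the text, and `percent_truncated` is a hypothetical helper name.

```python
def percent_truncated(count: int, length: int) -> int:
    """Return count/length as a whole percentage, rounded down."""
    if length <= 0:
        raise ValueError("alignment length must be positive")
    return (100 * count) // length


# (count, alignment length, percentage as reported in the tables)
reported = {
    "Table 15 identities": (1093, 1303, 83),
    "Table 15 gaps": (8, 1303, 0),
    "Table 18 identities": (1048, 1332, 78),
    "Table 18 positives": (1051, 1332, 78),
    "Table 18 gaps": (96, 1332, 7),
    "Table 19 identities": (1056, 1332, 79),
    "Table 19 positives": (1059, 1332, 79),
    "Table 19 gaps": (88, 1332, 6),
}

# Collect any entry whose truncated percentage disagrees with the table.
mismatches = [
    label
    for label, (count, length, pct) in reported.items()
    if percent_truncated(count, length) != pct
]
```

Under the truncation assumption every reported percentage is reproduced; conventional rounding would instead give 79% for 1051/1332 and 7% for 88/1332, which do not match the tables.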
TABLE 17
Score = 


3907 


Identities = 


Plus 




N0V3c: 


4 


NIK : 


3 


N0V3c : 


63 


122 




NIK : 


63 


122 




N0V3c: 


123 


182 




NIK : 


123 


179 




N0V3c: 


183 


242 




NIK : 


180 


239 




N0V3c: 


243 


302 




NIK : 


240 


299 




N0V3c: 


303 


362 




NIK : 


300 


359 




N0V3c : 


363 


422 




NIK : 


360 


419 




N0V3c: 


423 


482 




NIK : 


420 


479 




NOV3c: 


483 


542 




NIK : 


480 


539 




N0V3c: 


543 


601 





GGCGACCCAGCC-CCCGC.CCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCC 62 
GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC 
GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62 

TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 

TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA 



123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA 
GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA 



183 CGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGCAA 
GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA 



243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 
AT GCCAC TACTA GG GC TTCAT AAGAAGAGCCC CC GGA A GATGACCA CT 



303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 



CTGGCT GT ATGGAGTT TGTGG GCTGG TC T AC GACCT GT AAGAACAC AA 
CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 



3 63 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 
AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG T 



423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 
GGC CATCTCCAT CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT 



483 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTGAGCTGGACCGCAC 
GAC GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G AC 



543 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 
GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG 
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NIK : 
598 

N0V3c: 
5 660 

NIK : 
657 

10 N0V3c: 
720 

NIK : 
717 

15 

N0V3c: 
780 

NIK : 
20 777 

NOVSc: 
838 

25 NIK : 
835 

N0V3c: 
898 

30 

NIK : 
895 

N0V3c: 
35 958 

NIK : 
955 

40 N0V3c: 
1014 

NIK : 
1014 

45 

N0V3c: 
1074 

NIK : 
50 1074 

N0V3c: 
1132 

55 NIK : 
1132 

N0V3c: 
1192 

60 

NIK : 
1192 



5 40 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG 



602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 

CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA T TGGTC CT GG 
■ 5 9 9 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC 

661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 

ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA 
658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 



721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 

GC CT TT CTCAT CC G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG 
718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 



781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA 

AA TT TT A CTT AT GA TGTCT T AAGA TTAC TG AGC GCCC C A 
778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 



839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCACGGAGCGGCAGGTCCGCA 

C GAGCA CT T AA CC TTCAT GGGA CAGCCCA GA GGCAGGT CG A 
836 CAGAGCAACTTTTAAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 



8 99 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 

TCCAGCTTAAGGA CACAT GACCG CC G AAGAAG G GG GAGAAAGA GAGAC G 
896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 



959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG-AGGAAGGAGAG 

A TA GAGTACAGCGG AGCGAGGAGGA GA GA AG C TG AG AGGA GGAGAG 
956 AGTACGAGTACAGCGGGAGCGAGGAGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG 

1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 

CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT G CT 
1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 

1075 CAGCAGGAAAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGCAGCAGCTGCAGCAGC 

CAGCAGGA AA AAG AGC TC GAGGCT T AG CAGCAGC CTGCAG AGC 

1075 CAGCAGGAGAACAAGGAGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 



1133 AGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCA 

AGCAGC CG GA C GAGG AAA CA CTGCTG AG GGCAG GCG A 
1133 AGCAGCTCCGGGAGCAGGAGGAGTATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA 
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1193 TAGAGGAGCAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 

T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G 
1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 



N0V3c: 
1251 

NIK : 
1252 

N0V3c: 
1307 

NIK : 
1310 



1252 CAGCGGAAGCTGCAGGAGAAGGAGCAGCAGCGGCG-G~CTGGAGGACATGCAGGC-TCT 

C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT 

1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 



15 



N0V3c: 
1365 

NIK : 
1366 



1308 GCGGCGGGA—GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTCACAGG 

CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG 
1311 -CSA-GGAACTGGTVAAGGCGGCGTAAAGAAGAGGAAGAG-AGGAG-ACGGGCAGAAGAGG 



20 



25 



30 



35 



N0V3c: 
1423 

NIK : 
1423 

N0V3c: 
1479 

NIK : 
1482 

N0V3c: 
1538 

NIK : 
1537 



1366 CTA-GAGGAG-GAGCAGCGACAGCTCGAGATCCTTCAGCAACAGCTGCTCCAGGAACAGG 

A GAGGAG G G AG G AC GAG T C TCAG C GC GCT AGGA AG 

1367 AGAAGAGGAGAGTGGAGAGGGAACAGGAG-TACATCAGG — CGGCAGCTAGAGGAGGAGC 



1424 CCCTGCTGCTGGA-ATACA- -AGCGGAAGCAGCa?GGAGGAGCAGCGGCA-GTCAGAACGT 

C GC CTGGA AT C AGC G AGC GCT AGGAGCAG G CA GT A C 
1424 AGCGGCACCTGGAGATCCTGCAGCAGCAGCTGCTCCAGGAGCAG-GCCATGTTACTGCAC 

1480 CTCCAGAGGCAGCTGCA-GCAGGAGCATGCCTACCTCAAGTCCCTGCAGCAGCAGCAACA 

CCA AGG GC GCA GCA AGGA GC CC C G CCC GCAGCAGCAG A CA 
1483 GACCACAGGAGGCCGCACGGAC-AGCA-GCAG-CCGCC-GCCCCCGCAGCAGCAGGA-CA 



40 



N0V3c: 
1590 

NIK : 
1595 



1539 GCAGCAG — C-AGCTT-CA-GAAACAGCAGCAGCAGCAGCTCC-TG-CC-TGGGGACAGG 

G AGCA C AGCTT CA G CAG AGC AGC C C TG CC TG GACAG 
1538 GGAGCAAACCGAGCTTTCATGCTCCAG-AGCCCAAGCCTCACTATGACCCTGCTGACAG- 



45 



50 



55 



60 



N0V3c: 
1647 

NIK : 
1654 

N0V3c: 
1704 

NIK : 
1710 

N0V3c: 
1764 

NIK : 
1764 

N0V3c: 

NIK : 



1591 AAGCCCCTGTACCATTATGGTCGGGGCATGAATCCCGCT-GA-CAAAC-CAGCCTGGGCC 

AGC C G A TGGTC C G ATC C C GA CAA C CC G C 
1596 -AGCTCGGGAGGTACAGTGGTCCCACCTGGCATCTCTCAAGAACAATGTCTCCCCTGTCT 



1648 CGAGAGGTAGTGGCACACCGGGTCCCACTGAAGCCATAT — GGAGCACCTGTACC-CCGA 

CGAGA T C C G G CCC T CCA AT GCA CACC A C CCG 

1655 CGAGATCCCATTCCTT-CAGTGACCCT-TCTC-CCAAATTCGCA-CACCACCATCTCCGC 

17 05 TCCCAGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCTGCCTTCCCAGCCTCCCATGAC 



TC CAG CC 
1711 TCTCAGGACC- 



CA G CCA CC CCCG A GG GC CAG C C TGAC 

-CATGTCCA-CCTTCCCGCAGTGAGGG-GCTCAGTCAGAG-CTC-TGAC 



1765 CCCGACCCTGCCATCCCCGCACCCAC 1790 (SEQ ID NO: 66) 

C A C G T CCCG CCCAC 
1765 TCTAAGTCGGAGGTGCCCGAGCCCAC 1790 (SEQ ID NO: 33) 

TABLE 18

Score = 1985 bits (5143), Expect = 0.0
Identities = 1048/1332 (78%), Positives = 1051/1332 (78%), Gaps = 96/1332 (7%)

N0V3c: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQIJVAIKVMbVTX 60 

MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVT 
NIK : 1 MGDP7VPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIBCVMDVTE 60 

N0V3c: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKBCSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

IKQEINMLKKySHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 
NIK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

15 N0V3c: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 
KGNALKEDCIAYICREILRGLAHIiHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 
NIK : 121 KGNAIiKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

N0V3c: 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 
20 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 

NIK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 

N0V3c; 241 ALFLIPRNPPPRLKSKKWSJCKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 
25 NIK : 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 



30 



N0V3c: 301 QLBCDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360 

QLKDHI PSSIMNVPGESTLRREFLRLQQ 
NIK : 301 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360 

N0V3c: 361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 420 

ENKSNSEALK RDPEAHIKHLLH 
NIK : 361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420 

35 N0V3c: 421 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHRXXXXXXXXXXXXXXXXXXXXXXXXY 480 

DMQAL Y 
NIK : 421 LQEKEQQRRLEDMQAL RREEERRQAEREQEY 451 

N0V3c: 481 KRKXXXXXXXXXXXXXXXXXXHAYLKSXXXXXXXXXXXXXXXXXXXPGDRKPLYHYGRGM 540 
40 KRK HAYLKS PGDRKPLYHYGRGM 

NIK : 452 KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 511 

N0V3c: 541 NPADKPAWAREWAH RVP 558 

NPADKPAWAREV + P 

45 NIK : 512 NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP 571 



50 



N0V3c: 559 LKPYAAP VPRSQSLQDQPTRNLAAFPASHXXXXXXXXXXXXXX 601 

++P P VPRSQSLQDQPTRNLAAFPASH 
NIK : 572 VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 631 

N0V3c: 602 XRGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 661 

RGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 
NIK : 632 ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPBCVPQRTSSIATALNTSGAGGSRPAQ 691 

55 N0V3c: 662 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 721 
AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 
NIK : 692 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 751* 

N0V3c: 722 PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYXXXXXXXXXXXXXXXXX 781 
60 PDDHRSRPGRPA DF7LLKERTLDEAPRPPKKAMDY 

NIK : 752 PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 803 

NOV3c:  782 XXXXXXXXRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 841
                    RDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH
NIK  :  804 EGGPAEGSRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 863

NOV3c: 842 ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 901

ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY
NIK : 864 ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 923

NOV3c: 902 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 961
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS
NIK : 924 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 983

NOV3c: 962 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 1021

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN
NIK : 984 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 1043

NOV3c: 1022 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1081

KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV
NIK : 1044 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1103

NOV3c: 1082 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1141

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI
NIK : 1104 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1163

NOV3c: 1142 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1201

YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY
NIK : 1164 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1223

NOV3c: 1202 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1261

ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
NIK : 1224 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1283

NOV3c: 1262 FMTLNRNCIMNW 1273 (SEQ ID NO: 67)
FMTLNRNCIMNW

NIK : 1284 FMTLNRNCIMNW 1295 (SEQ ID NO: 34)

TABLE 19 

Score = 2007 bits (5201), Expect = 0.0
Identities = 1056/1332 (79%), Positives = 1059/1332 (79%), Gaps = 88/1332 (6%)

NOV3c: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60
MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVT
GCK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60

NOV3c: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120

IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
GCK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120
NOV3c: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180
KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEV



TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR



ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI



QLKDHI PSSIMNVPGESTLRREFLRLQQ 



ENKSNSEALK RDPEAHIKHLLH 



DMQAL Y 
3EKEQQRRLEDMQAL RREEERRQAEREQEY 451 

RKXXXXXXXXXXXXXXXXXXHAYLKSXXXXXXXXXXXXXXXXXXXPGDRKPLYHYGRGM 540 
KRK HAYLKS PGDRKPLYHYGRGM



GCK : 121
NOV3c: 181
GCK : 181
NOV3c: 241
GCK : 241
NOV3c: 301
GCK : 301
NOV3c: 361
GCK : 361
NOV3c: 421
GCK : 421
NOV3c: 481
GCK : 452
NOV3c: 541
GCK : 512
NOV3c: 559
GCK : 572
NOV3c: 602
GCK : 632
NOV3c: 662
GCK : 692
NOV3c: 722
GCK : 752
NOV3c: 782
GCK : 812
NOV3c: 842
GCK : 872
NOV3c: 902
GCK : 932
NOV3c: 962 1021





NPADKPAWAREV + P 



LKPYAAP VPRSQSLQDQPTRNLAAFPASHXXXXXXXXXXXXXX 601 

++P P VPRSQSLQDQPTRNLAAFPASH 



RGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 



AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 



PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDY



RDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 
EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 

ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQESHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN
GCK : 992 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 1051

NOV3c: 1022 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1081

KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV
GCK : 1052 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1111

NOV3c: 1082 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1141

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI
GCK : 1112 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1171

NOV3c: 1142 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1201

YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY
GCK : 1172 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1231

NOV3c: 1202 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1261

ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
GCK : 1232 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1291

NOV3c: 1262 FMTLNRNCIMNW 1273 (SEQ ID NO: 68)
FMTLNRNCIMNW

GCK : 1292 FMTLNRNCIMNW 1303 (SEQ ID NO: 35)



Based on its relatedness to known members of the STE20 family of protein kinases, 
NOV3b provides new diagnostic and therapeutic compositions useful in the treatment of 
disorders associated with alterations in the expression of members of the STE20 family of
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies, 
including, by way of nonlimiting example, those involving metabolic and endocrine disorders, 
cancer, bone disorders, and tissue/cell growth regulation disorders. 


NOV-3d 

A NOV-3d sequence according to the invention is a nucleic acid sequence encoding a
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3d nucleic acid and
its encoded polypeptide include the sequences shown in Table 20. The disclosed nucleic acid
(SEQ ID NO: 12) is 3735 nucleotides in length and contains an open reading frame (ORF) that
begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at
nucleotides 3733-3735. The start and stop codons are shown in bold font. The disclosed,
respective ORF encodes a 1244 amino acid polypeptide (SEQ ID NO: 13).
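
The stated polypeptide length can be checked against the ORF coordinates. The following is a minimal sketch, not part of the disclosure; the function name `encoded_residue_count` is hypothetical, and it assumes 1-based inclusive coordinates running from the first base of the start codon to the last base of the stop codon.

```python
def encoded_residue_count(orf_start: int, orf_end: int) -> int:
    """Residues encoded by an ORF given 1-based, inclusive coordinates.

    orf_start is the first base of the start codon and orf_end is the
    last base of the stop codon (an assumption of this sketch).
    """
    orf_length = orf_end - orf_start + 1
    if orf_length % 3 != 0:
        raise ValueError("ORF length must be a whole number of codons")
    # Every codon except the terminal stop codon contributes one residue.
    return orf_length // 3 - 1


# Coordinates quoted in the text for SEQ ID NO: 12 (ATG at 1-3, TGA at 3733-3735).
print(encoded_residue_count(1, 3735))  # 1244
```

With 3735 nucleotides there are 1245 codons, one of which is the stop codon, which agrees with the 1244 amino acid polypeptide of SEQ ID NO: 13.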
TABLE 20

ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGACGACATCGACCTGT^^
TGTGGAGGTGGTCGGCaATGGAACCTACGGACAGGTGTACaAfiGGTCGGCATGTC^ 
5 AGGTCATGGATGTCACGGAGGACGAGGAGGAAGAGATCAAACAGGAGATCAACATG^ 

AACATCGCCyiCCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 
CTGTGGTGCTGGTTCAGT6ACTGACCTGGTAAAGAACACAAAAGGCAACTO 
GCAGGGAGATCCTCAGGGGTCTGGCCCATCTCCATGCCCACaAGGTGATCCATCGAG^ 
CTGACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACC© 

10 TTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCGCCTGTGATGAGAACCCTG^^ 

GTGATATTTGGTCTCTAGGAATCACA6CCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 
GCCCTCTTCCTCATTCCTCGGaACCCTCCGCCCAGGCTCyiAGTCCAAGAAGTGGTCTAAGAAGTTCA 
CACATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGGAGCrACTGAAGTTTCCCTTC^ 
CGGAGCGGCAGGTCCGCATCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGAC^ 

1 5 TATGAGTACAGCGGCAGCGAGGAGGAAGATGACAGCCATGGAGAGGAAGGAGAGCCAAGCTCCATC^^ 
AGAGTCGACrCrTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTCAGAGGCT 
AGCTGCAGCAGCAGmGCAGCGAGACCCCGAGGCA(^CATC:MA»CCT 
CS^GT^GGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCAGCGGAATC 
GCGGCGGCTGGAGGACATGCAGGCTCTGCGGCGGGAGGAGGAGCGGCGGCAGGCGGAGCGCGAGC^^ 

20 ACAGGCTAGAGGAGCAGCGGCAGTCAGAACGTCTCCAGAGGCAGCTGaVGCAGGAGCATGC^ 
CAGCAGCAACAGCAGCTIGCAGCI^CAGAAACAGCAGCAGCAGCAGCT 

TGGTCGGGGCATGAATCCCGCTGACAAACCAGCCTGGGCCCGAGAGGTAGTGGCACyiCCGGGTCCC^ 

CAGCACCTGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCrrGCCTTC^ 

GACCCTGCCATCCCCGCACCCTiCrrGCCACGCCCAGTGCCCGAGGAGCTGTCATCCGCaVGAATTC^^ 

25 AGGACCTGGCCCCAGCCCGAATCCCCCAGCCTGGGTCCGCCCAGATAACGAGGCCCCACCCAAGGTGCCTCAGAGGACCT 
CATCTATCGCCACTGCCCTTAACACCAGTGGGGCCGGAGGGTCCCGGCCAGCCCAGGCAGTCCGTGCCAGT 
CTCAGGAGGAGCGACCCTGGCTGGGAACGCTCGGACAGCGTCCTTCCAGCCTCTCACGGGCACCT 
ACTGGAGCGGAACCGCGTGGGAGTCTCCTCCAAACCGGACAGCrrCCCCTGTGCTCT^ 
ACGACCACCGCTCACGGCCaGGCCGGCCCGCAAGCTATAAGCGAGCAATTGGT^ 

30 ACTCTGGACGAGGCCCCTCGGCCTCCCAAGAAGGCCaVTGGACTACTCGTCGTCCAGCGAGGAGGTGGAAAGCAG^ 
CGACGAGGAGGAAGGCGAAGGCGGGCCAGCAGAGGGGAGCAGAGATACCCCTGGGGGCCGCAG^ 
GCGTCAGCACCaTGGTGGTCGAOSACGTCGflGGAGATCACCGGGACCCAGCCCCCATACGGGG© 
CAGCGCACCCCTGAAGAGGAGOSGAACCTGCTGCATGCTGACAGCAATGGGTACACAfl^ 
CAGCCACTCACCCACCGAGAACAGCAAAGGCCAAAGCCCACCCTCGAT^GGATGGGAGTGGTGACTA^ 

35 TGGTAAAGGCCCCTGGCAAGAGCTCGTTCACGATGTTTGTGGATCTAGGGArCTACCAGCCTGGAGGC^ 

ATCCCCATCACAGCCCTAGTGGGTGGAGAGGGCACTCGGCTCGACCAGCTGCAGTACGACGTGAGGAAGGOT 
CAACGTGAATCCCACCAACy^CCCGGGCCCACAGTGAGACCCCTGAGATCCGGAAGTAa^GAAG^ 
TCCTCTGTGCAGCCCTTTGGGGGGTCAACCTGCTGGTGGGCACGGAGAACGGGCTGATGTTGCTGGACCGAA6TGGGCA6 
GGCAAGGTGTATGGACTCATTGGGCGGCGACGCTTCCAGCyVGATGGATGTGCTGGAGGGGCTCAA 

40 CTCAGGGAAAAGGAACAZVACTGCGGGTGTATTACCTGTCCTGGCTCCGGAACAAGATTCTGCACAATGACCCAGAAGT 
AGAAGAAGCAGGGCTGGACCACCGT6GGGGACATGGAGGGCTGCGGGCACTACCGTGTTGTGAAATACGAGCGGATTM 
TTCCTGGTCATCGCCCTCaAGAGCTCCGTGGAGGTGTATGCCTGGGCCCCCAAACCCTACCACAAATTCATGGCCTO 
GTCCTTTGCCGACCTCCCCCACCGCCCTCTGCTGGTCGACCTGACAGTAGAGGAGGGGCaGCGGCTCAAGGTCATCT 
GCTCCAGTGCTGGCTTCC3VTGCTGTGGATGTCGACTCGGGGAACAGCTATGACATCT 

45 CAGATCACGCCCCATGCCATCATCTTCCTCCCCAACACCGACGGCATGGAGATGCTGCTGTGCTACGAGGA 

CTACGTCAACACGTACGGGCGCATCATTAAGGATGTGGTGCTGCAGTGGGGGGAGATGCCTACTTCTGTGGCCTACATCT 
GCTCCAACCAGATAATGGGCTGGGGTGAGAAAGCCATTGA^GATCCGCTCTGTGGAGACGGGCCACCT
ATGCACaAACGAGCTCAGAGGCTCAAGTTCCTGTGTGAGCGG^ 

CAGCAGCCAAGTTTACTTCATGACTCTGaACCGTAACTGCATCATGAACTGGTGA (SEQ ID NO: 12) 

5 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTyGQVYKGRH\^GQLAAIKV^ 
NlATYYGAFIKKSPPGNDDQLWLVMErcGAGSVTDLVKNTKGNAIiKEDCIAYICRE 
LTENAEVKLVDroVSAQLDRTVGRRNTFIGTPYWMAPE\aACDENPDATYDYRSDIffSI.6IT^^ 
ALHTiIPiytTPPPRIiKSKKWSKKFlDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTER^ 
YEYSGSEEEDDSHGEEGEPSSIMNVPGESTIjmEFIiRIiQQENKSNSEAIJCQCXXSLQCKXK^ 
1 0 QKEERRRVEEQQRREREQRKLQEKEQQRRIiEDMQALRREEERRQAEREQEYIiam^ 

QQCXXKX5LQKQQQQQLLPGDRKPLYHYGRGWNPADKPAWAREWAHRVPIJCPYAAPVPRSQ^ 

DPAIPAPTATPSTIRGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATAIJJTSGAGGSRPAQAVRASNP^ 
LRRSDPGWERSDSVLPASHGHLPQAGSI£RNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPM 
TLDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGOTDSVSTMVVHDVEEITGTQPPY 
15 QRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQ5PPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDS 
IPITALVGGEGTRLMIK2YDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCai^ 
GK\^GLIGRRRFQQMDVLEGLNLLITI3GKRNKLRVYYLSWLRNKIIJINDP 
FLVIAIJCSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRIJCVIYGSSAGFH^ 
QITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQW6EMPTSVAYICSNQIMGWGEKAIEIRSTO 

20 MHKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTIJJRNCIMNW (SEQ ID NO: 13) 

The disclosed NOV-3d nucleic acid sequence has homology (73% identity) to a mouse
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table
21. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the
"Expect" value, the probability of this alignment occurring by chance alone is 2.2e-295.
Moreover, the disclosed, encoded amino acid sequence has 1046 of 1303 amino acid residues
(80%) identical to a human NIK-related protein (GenBank Accession No: BAA90753), shown
in Table 22. Furthermore, the disclosed, encoded amino acid sequence also has homology
(80% identity) to a human GCK kinase (GenBank Accession No: BAA94838), another
subgroup of the STE20 kinase family, as shown in Table 23. As indicated by the "Expect"
value, the probability of each of these amino acid alignments occurring by chance alone is 0.0,
the lowest probability score.
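
The percentages quoted for Tables 21 to 23 can be reproduced from the raw counts. The sketch below is illustrative only; the helper name `blast_percent` is hypothetical, and it assumes that the reported whole-number percentages are truncated rather than rounded, which is inferred from the figures themselves and is not stated in the text.

```python
def blast_percent(count: int, alignment_length: int) -> int:
    """Whole-number percentage, truncated rather than rounded (assumption)."""
    return count * 100 // alignment_length


# (description, count, alignment length, percentage reported in the table)
REPORTED = [
    ("Table 21 identities", 1260, 1725, 73),
    ("Table 22 identities", 1046, 1303, 80),
    ("Table 22 gaps", 67, 1303, 5),
    ("Table 23 identities", 1054, 1303, 80),
    ("Table 23 gaps", 59, 1303, 4),
]

for description, count, length, reported in REPORTED:
    computed = blast_percent(count, length)
    status = "agrees" if computed == reported else "differs"
    print(f"{description}: {count}/{length} -> {computed}% ({status} with reported {reported}%)")
```

Under this assumption, 1054/1303 (about 80.9%) is reported as 80% and 59/1303 (about 4.5%) as 4%, which is why the two protein alignments share the same quoted identity despite slightly different counts.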
TABLE 21

Score = 3832 (575.0 bits), Expect = 2.2e-295, Sum P(2) = 2.2e-295
Identities = 1260/1725 (73%), Positives = 1260/1725 (73%), Strand = Plus / Plus

N0V3d: 4 GGCGACCCAGCC-CCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCC 62 

GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC 
NIK : 3 GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62 

N0V3d : 63 TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 
122 

TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA 
NIK : 63 TGCTGGGATTTTTGAGCTGGTGGAAGTGGrTGGAAATGGCACCTATGGACAAGTCTATAA 
122 

N0V3d: 123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA 
182 

GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA 
NIK : 123 GGGTCGACATGTTAAAACGGT-CA-CTGCC-GCCATCAAGGTTATGGACGTCACCGAGGA 
179 



N0V3d: 183 CGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACGACCGCAA 
242 

GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA 
NIK : 180 TGAAGAGGAAGAAATCACACTGGAGATAAATATGCTGAAGAAGTATTCTCATCATCGAAA 
239 

N0V3d: 243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 
302 

AT GCCAC TACTA GG GO TTCAT AAGAAGAGCCC CO GGA A GATGACCA CT 
NIK : 240 TATTGCCACGTACTATGGTGCTTTCATTAAGAAGAGCCCTCCAGGACATGATGACCAACT 
299 



N0V3d : 303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 
362 

CTGGCT GT ATGGAGTT TGTGG GCTGG TC T AC GACCT GT AAGAACAC AA 
NIK : 300 CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 
359 

N0V3d : 363 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 
422 

AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG T 
NIK : 360 AGGGAACACTCTCAAAGAAGACTGGATTGCTTACATCTCCAGGGAAATCCTCAGGGGATT 
419 

N0V3d : 423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 
482 

GGC CATCTCCAT CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT 
NIK : 420 GGCACATCTCCATATTCACCACGTTATTCACCGAGATATCAAGGGCCAAAATGTGCTGCT 
479 

N0V3d: 483 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCAC 
542 

GAC GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G AC 
NIK : 480 GACCGAGAATGCTGAGGTGAAACTTGTTGATTTTGGTGTAAGCGCTCAGCTGGACAGGAC 
539 

N0V3d : 543 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 
601 

GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG 
NIK : 540 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG 
598 
NOV3d: 602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 660

CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA T TGGTC CT GG
NIK : 599 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC 657

NOV3d: 661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 720

ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA
NIK : 658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 717

NOV3d: 721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 780

GC CT TT CTCAT CC G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG
NIK : 718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 777

NOV3d: 781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA 838

AA TT TT A CTT AT GA TGTCT T AAGA TTAC TG AGC GCCC C A
NIK : 778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 835

NOV3d: 839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCACGGAGCGGCAGGTCCGCA 898

C GAGCA CT T AA CC TTCAT GGGA CAGCCCA GA GGCAGGT CG A
NIK : 836 CAGAGCAACTTTTAAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 895

NOV3d: 899 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 958

TCCAGCTTAAGGA CACAT GACCG CC G AAGAAG G GG GAGAAAGA GAGAC G
NIK : 896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 955

NOV3d: 959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG-AGGAAGGAGAG 1014

A TA GAGTACAGCGG AGCGAGGAGGA GA GA AG C TG AG AGGA GGAGAG
NIK : 956 AGTACGAGTACAGCGGGAGCGAGGAGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG 1014

NOV3d: 1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 1074

CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT G CT
NIK : 1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 1074

NOV3d: 1075 CAGCAGGAAAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGCAGCAGCTGCAGCAGC 1132

CAGCAGGA AA AAG AGC TC GAGGCT T AG CAGCAGC CTGCAG AGC
NIK : 1075 CAGCAGGAGAACAAGGAGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 1132

NOV3d: 1133 AGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCA 1192

AGCAGC CG GA C GAGG A A A CA CTGCTG AG GGCAG GCG A
NIK : 1133 AGCAGCTCCGGGAGCAGGAGGAGTATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA 1192



NOV3d: 1193 TAGAGGAGCAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 1251

T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G
NIK : 1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 1252



NOV3d: 1252 CAGCGGAAGCTGCAGGAGAAGGAGCAGCAGCGGCG--G-CTGGAGGACATGCAGGC-TCT 1307

C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT
NIK : 1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 1310

NOV3d: 1308 GCGGCGGGA--GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTCACAGG 1365

CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG
NIK : 1311 -CGA-GGAACTGGAAAGGCGGCGTAAAGAAGAGGAAGAG-AGGAG-ACGGGCAGAAGAGG 1366



NOV3d: 1366 CTA-GAGGAGCAGCGGC-AGT---CAGAACGT-CTCCAGA-GGCAGCTGCAGCAGGAGCA 1418

A GAGGAG AG GG AG CAG A GT C CAG GGCAGCT AG AGGAGCA
NIK : 1367 AGAAGAGGAG-AGTGGAGAGGGAACAGGA-GTACATCAGGCGGCAGCTAGAGGAGGAGCA 1424



NOV3d: 1419 T-GCCTACCTCAAGTCCCTGCAGCAGCAGCAACAGCAGCAGCAGCTTCA-GAAACAGCA- 1475

G C ACCT AG CCTGCAGCAGCAGC C CAG AGCAG CA G AC GCA
NIK : 1425 GCGGC-ACCTGGAGATCCTGCAGCAGCAGCTGCTCCAGGAGCAGGC-CATGTTACTGCAC 1482



NOV3d: 1476 G-CAGCAGGAGCTCCTGCCTGGGGA-CAGGAAGCCCCTGTACCATTATGGTCGGGGCATG 1533

G C CAG AG CC GC G A CAG A GCC CGCC A G C G CA G
NIK : 1483 GACCACAGGAGG-CC-GCACGCACAGCAGCA-GCCGCCGCCCCCGCA~G-CAG~CAGG 1534

NOV3d: 1534 AATCCCGCTGACAAACC-AGCCTGG--GCCCGAGAGGTAGTGGCACACCGGGTCCCA-CT 1589

A C G G CAAACC AGC T GC C AGAG G C CAC G CCC CT
NIK : 1535 A--CAGGA-G-CAAACCGAGCTTTCATGCTCCAGAGCCCAAGCCTCACTATGACCCTGCT 1590

NOV3d: 1590 GA-AGCCATATGCAGC-ACC-TGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCACCCG 1646

GA AG T G AG AC TG CCC T CA TC CT AG AC A C CCC
NIK : 1591 GACAGAGCTCGGGAGGTACAGTGGTCCCACCTGGCA-TCTCTCAAGAACAATGTCTCCCC 1649



NOV3d: 1647 AAACCTG-GCTGCC-TTCC--CAGCCTCCCATGACCCCGACCCTGC-CATCCCCGCACCC 1701

C G G T CC TTCC CAG CCC T CCC A C GC CA C CC CC
NIK : 1650 TGTCTCGAGATCCCATTCCTTCAGTGACCCTTCTCCCAAATTC-GCACACCACCATCTCC 1708






N0V3d: 1702 ACTGCCACG-CCCAGTGCCC 1720 (SEQ ID NO: 69) 

CT CA G CCCA TG CC
NIK : 1709 GCTCTCAGGACCCA-TGTCC 1727 (SEQ ID NO: 36) 
TABLE 22

Score = 1995 bits (5170), Expect = 0.0
Identities = 1046/1303 (80%), Positives = 1049/1303 (80%), Gaps = 67/1303 (5%)

NOV3d: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60
MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVT
NIK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60

NOV3d: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120

IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NIK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120

NOV3d: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180
KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR

NIK : 121
NOV3d: 181
NIK : 181
NOV3d: 241
NIK : 241
NOV3d: 301
NIK : 301
NOV3d: 361
NIK : 361
NOV3d: 421
NIK : 421
NOV3d: 481
NIK : 481
NOV3d: 527
NIK : 541
NOV3d: 542
NIK : 601
NOV3d: 602
NIK : 661
NOV3d: 662
NIK : 721
NOV3d: 722
NIK : 773
NOV3d: 782



TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 



ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI



QLKDHI PSSIMNVPGESTLRREFLRLQQ



ENKSNSEALK RDPEAHIKHLLH 



DMQAL Y R HAYLKS



CPGDRKPLYHYGRGMNPADKPAWAREWAH 526 

PGDRKPLYHYGRGMNPADKPAWAREV 



-RVPLKPYAAP VPRSQ 541 

+ P++P P VPRSQ 



SLQDQPTRNLAAFPASH RGAVIRQNSDPTSEGPGPSPNPPAWVRP 



DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 



LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA DFVLLKERT

PQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA DFVLLKERT 772 



LDEAPRPPKKAMDY RDTPGGRSDGDTDSVSTMVVH
DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP
NIK : 833 DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP 892

NOV3d: 842 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 901

SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV
NIK : 893 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 952

NOV3d: 902 RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 961

RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG
NIK : 953 RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 1012

NOV3d: 962 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 1021

KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT
NIK : 1013 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 1072

NOV3d: 1022 VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 1081

VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL
NIK : 1073 VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 1132

NOV3d: 1082 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 1141

TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC
NIK : 1133 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 1192

NOV3d: 1142 YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 1201

YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM
NIK : 1193 YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 1252

NOV3d: 1202 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1244 (SEQ ID NO: 70)

HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW

NIK : 1253 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1295 (SEQ ID NO: 37)

TABLE 23 

Score = 2018 bits (5228), Expect = 0.0
Identities = 1054/1303 (80%), Positives = 1057/1303 (80%), Gaps = 59/1303 (4%)

NOV3d: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60

MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVT
GCK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60

NOV3d: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120

IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
GCK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120

NOV3d: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180

KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
GCK : 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180

N0V3d: 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 

TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 
GCK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI

QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ
QLKDHI PSSIMNVPGESTLRREFLRLQQ
QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ

ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ENKSNSEALK RDPEAHIKHLLH



DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH

NOV3d: 241
GCK : 241
NOV3d: 301
GCK : 301
NOV3d: 361
GCK : 361
NOV3d: 421
GCK : 421
NOV3d: 481
GCK : 481
NOV3d: 527
GCK : 541
NOV3d: 542
GCK : 601
NOV3d: 602
GCK : 661
NOV3d: 662
GCK : 721
NOV3d: 722
GCK : 781
NOV3d: 782
GCK : 841
NOV3d: 842
GCK : 901
NOV3d: 902
GCK : 961 1020
NOV3d: 962 1021
GCK : 1021 1080
NOV3d: 1022 1081





DMQAL Y R HAYLKS 



PGDRKPLYHYGRGMNPADKPAWAREV 



-RVPLKPYAAP VPRSQ 541 

+ P++P P VPRSQ 



SLQDQPTRNLAAFPASH RGAVIRQNSDPTSEGPGPSPNPPAWVRP 



LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERT



LDEAPRPPKKAMDY RDTPGGRSDGDTDSVSTMWH 



DVEEITGTQPPYGGGTMWQRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQSPP 



SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 



RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG



KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT



VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
GCK : 1081 VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 1140

NOV3d: 1082 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 1141

TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC
GCK : 1141 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 1200

NOV3d: 1142 YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 1201

YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM
GCK : 1201 YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 1260

NOV3d: 1202 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1244 (SEQ ID NO: 71)

HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW
GCK : 1261 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1303 (SEQ ID NO: 38)

Based on its relatedness to known members of the STE20 family of protein kinases,
NOV3d provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the STE20 family of
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving metabolic and endocrine disorders,
cancer, bone disorders, and tissue/cell growth regulation disorders.

Table 24 shows a multiple sequence alignment of the disclosed NOV-3 polypeptides
with a STE20 protein (GenBank Accession No: BAA90753), indicating the homology
between the present invention and a known member of the protein family.
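
The asterisk line printed under each block of a multiple sequence alignment marks columns in which every aligned sequence carries the same residue. The sketch below shows one way such a line could be derived; the function name `conservation_line` is hypothetical, the two short rows are an illustrative excerpt rather than a full alignment block, and the additional symbols that some alignment programs use for similar (non-identical) residues are deliberately omitted.

```python
def conservation_line(rows: list[str]) -> str:
    """Return '*' for columns where all rows share one residue, else ' '.

    Rows must be pre-aligned and of equal length; '-' denotes a gap and
    never counts as a conserved residue.
    """
    if not rows:
        raise ValueError("at least one aligned row is required")
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    marks = []
    for column in zip(*rows):
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


# Illustrative 12-residue excerpt from two aligned rows.
rows = ["REQEYKRKQLEE", "REQEYIRHRLEE"]
for row in rows:
    print(row)
print(conservation_line(rows))
```

A block in which all five sequences are identical across 60 columns therefore yields a line of 60 asterisks, while any substitution or gap leaves a blank at that column.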



TABLE 24 



STE20 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3b MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3a MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3d MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3c MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE

STE20 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3b DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3a DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3d DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3c DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
************************************************************



STE20 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3b KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3a KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3d KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3c KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
************************************************************
STE20 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3b TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3a TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3d TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3c TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
************************************************************



STE20 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3b ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3a ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3d ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3c ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
************************************************************

STE20 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3b QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3a QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3d QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3c QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
************************************************************



STE20 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3b ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3a ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3d ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3c ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
************************************************************

STE20 LQEKEQQRRLEDMQALRREEERRQAEREQEYKRKQLEE
NOV3b LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEE
NOV3a LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEEQRQLEILQQQLLQEQALLLEY
NOV3d LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEE
NOV3c LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEEQRQLEILQQQLLQEQALLLEY
******************************* *. ,***
STE20 QRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM
NOV3b QRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM
NOV3a KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM
NOV3d QRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM
NOV3c KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM
*************************** *************************^t



STE20 NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP
NOV3b NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP
NOV3a NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP
NOV3d NPADKPAWAREV
NOV3c NPADKPAWAREV
************

STE20 VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS
NOV3b VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS
NOV3a VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS
NOV3d VAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS
NOV3c VAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS
*************************************************



STE20 ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3b ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3a ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3d ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3c ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
************************************************************
STE20 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3b AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3a AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3d AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3c AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
************************************************************



10 



15 



20 



STE20 
NOVSb 
N0V3a 
N0V3d 
N0V3c 



STE20 
NOV3b 
N0V3a 
N0V3d 
N0V3c 



PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 

PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 

PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKBCAMDYSSSSEEVESSEDDEEEG 

PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPBCBCAMDYSSSSEEVESSEDDEEEG 

PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 
************ **************************************** 

EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 

EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 

EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 

EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH

EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 
****************************************^jt********** ******** 



STE20
NOV3b
NOV3a
NOV3d
NOV3c



STE20
NOV3b
NOV3a
NOV3d
NOV3c



ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMEVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
********************************************** 4,**^^>jt******* 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 

-QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS
************************************************************ 



STE20
NOV3b
NOV3a
NOV3d
NOV3c



EILCAALWGWLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGBO^ 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGBOEIN 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGEC^ 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 
************************************************************ 






STE20
NOV3b
NOV3a
NOV3d
NOV3c



STE20
NOV3b
NOV3a
NOV3d
NOV3c



KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRWKYERIKFLVIALKSSVEV 

KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 

KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 

KLRVYYLSWLRNKILHNDPEVEBCKQGWTTVGDMEGCGHYRWECYERIKFLVIALKSSVEV 

KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 
************************************************************ 

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
************************************************************ 



STE20
NOV3b
NOV3a
NOV3d
NOV3c



YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIECDWLQWGEMPTSVAY 

YIPVHIQSQITPHAIIPLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 

YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY

YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY

YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY
************************************************************ 
STE20
NOV3b
NOV3a
NOV3d
NOV3c



ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFXCERNDKVFFAS VRSGGSSQVY 

ICSNQIMGWGEKAIEIRSVETGHLDGVEmKRAQRLKFLCERNDKVFFASVRSGGSSQV^ 

ICSNQIMGWGEKMEIRSVETGHLDGVEMlKRAQEaiKFLCERNDK\^FASVRSGGS 

ICSNQIMGWGEKMEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 

ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 

****************************^r**** ********* ****************** 



STE20  FMTLNRNCIMNW (SEQ ID NO: 39)
NOV3b  FMTLNRKCIMNW (SEQ ID NO: 9)
NOV3a  FMTLNRNCIMNW (SEQ ID NO: 7)
NOV3d  FMTLNRNCIMNW (SEQ ID NO: 13)
NOV3c  FMTLNRNCIMNW (SEQ ID NO: 11)



Consensus key
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
  - no consensus
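
The way a consensus line of this kind is derived can be illustrated with a short sketch. The Python snippet below is not part of the original disclosure: it marks only fully conserved columns with `*` for the five C-terminal fragments shown in the final alignment block above. The function name `conserved_columns` is hypothetical, and the strong (`:`) and weak (`.`) group symbols are deliberately left out, so every non-identical column is printed as a blank.

```python
def conserved_columns(rows):
    """Return a consensus line with '*' where every row has the same residue."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    # zip(*rows) walks the alignment column by column
    return "".join("*" if len(set(col)) == 1 else " " for col in zip(*rows))

# C-terminal fragments in the order STE20, NOV3b, NOV3a, NOV3d, NOV3c
fragments = [
    "FMTLNRNCIMNW",
    "FMTLNRKCIMNW",
    "FMTLNRNCIMNW",
    "FMTLNRNCIMNW",
    "FMTLNRNCIMNW",
]
consensus = conserved_columns(fragments)
print(consensus)
```

Only the seventh column differs (N in four sequences, K in NOV3b), so it is the single unmarked position in this simplified consensus.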



Based on the relatedness between NOV-3 and STE20 kinases, the disclosed NOV-3
proteins are novel members of the STE20 protein kinase family. Therefore, the nucleic acids
and proteins of the invention are useful in potential therapeutic applications implicated in
various pathologies and disorders described and other pathologies and disorders related to
aberrant function or aberrant expression of these STE20 protein kinases.

Potential therapeutic uses for the nucleic acids and proteins of the invention include, by
way of nonlimiting example, protein therapeutic, small molecule drug target, antibody target
(including therapeutic, diagnostic, or drug targeting/cytotoxic antibodies), diagnostic and/or
prognostic marker, gene therapy (gene delivery/gene ablation), research tools, and tissue
regeneration in vitro and in vivo (regeneration for all these tissues and cell types composing
these tissues and cell types derived from these tissues).



The nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in various pathologies/disorders described above, as well as
other pathologies or disorders. For example, a cDNA encoding the STE20 protein kinase-like
protein may be useful in gene therapy, and the STE20 protein kinase-like protein may be
useful when administered to a subject in need thereof. By way of nonlimiting example, the
compositions of the present invention will have efficacy for treatment of patients suffering
from the pathologies described above. The novel nucleic acids encoding the STE20 protein
kinase-like proteins, and the STE20 protein kinase-like proteins of the invention, or fragments
thereof, may further be useful in diagnostic applications, wherein the presence or amount of
the nucleic acid or the protein is to be assessed. These materials are further useful in the
generation of antibodies that bind immunospecifically to the novel substances of the invention
for use in therapeutic or diagnostic methods.

NOV-4: A Novel Trypsin Inhibitor-like Protein

The NOV-4 sequences (NOV-4a, NOV-4b, NOV-4c, NOV-4d, and NOV-4e)
according to the invention are nucleotide sequences encoding respective polypeptides related
to trypsin inhibitor proteins.

The disclosed NOV-4 sequences are splice variants. Splice variants occur naturally.
When a variant and the original sequence have the same or opposite activity, they may differ
in various properties not directly connected to biological activity. A certain variant may be
expressed mainly in one tissue, while the original sequence from which it has been varied, or
another variant derived from the same sequence, may be expressed mainly in another tissue.
The presence or level of specific splice variants may be the cause, and/or indicative of, a
disease, disorder, pathological or normal condition.

Because a drug may be effective against one variant but not another, or may cause side
effects because it targets all splice variants, an effective drug needs to target the particular
splice variant. Because soluble variants with therapeutic or disease-related functions may be
naturally occurring in specific tissues, they may be optimal candidates for drug targets or
protein therapeutics. Variants may have no activity at all and may serve as dominant negative
natural inhibitors. Thus, splice variants are useful in generating new drug targets, protein
therapeutics and markers for diagnostics.

NOV-4 sequences according to the invention encode polypeptides related to trypsin
inhibitor proteins that are expressed in brain tumors, polypeptides related to sperm coat
glycoproteins, and polypeptides related to glioma pathogenesis related proteins. See
Yamakawa et al., 1998, Biochim Biophys Acta 1395(2):202-8; Murphy et al., 1995, Gene
159(1):131-5. In addition, similarities were found between NOV-4 and insect allergens in
wasps, hornets, fire ants, and secreted/membrane proteins in nematode pathogens. See J
Allergy Clin Immunol 1990, 85(6):988-96. Therefore, the nucleic acids and proteins of the
NOV-4 splice variants described in this invention can have functions similar to those of these proteins.

NOV-4 proteins are expressed in the following tissues: pituitary gland, mammary
gland, adrenal gland, thalamus, and fetal lung.

Functional roles attributed to trypsin inhibitor proteins include sperm coat maturation,
immunological responses, glioma pathogenesis, and signal transduction pathways. Thus,
NOV-4 nucleic acids and polypeptides, antibodies and related compounds according to the
invention will be useful in therapeutic and diagnostic applications in disorders associated with,
e.g., reproductive disorders, immunological disorders, cancer, and metabolic disorders.
Additional utilities for NOV-4 nucleic acids and polypeptides according to the invention are
disclosed herein.



NOV-4a 

A NOV-4a sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4a nucleic acid and its
encoded polypeptide are included in Table 25. The disclosed nucleic acid (SEQ ID NO: 14) is
2305 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 453, and ends with a TGA stop codon at nucleotide 1602.
A disclosed, representative ORF encodes a 383 amino acid polypeptide (SEQ ID NO: 15).
NOV-4a is missing one exon in the 5' nucleotide region compared to other splice variants
(NOV-4b and NOV-4c), resulting in an alternative methionine start codon and a Kozak
sequence.
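
As a rough consistency check on the coordinates given for this ORF, the number of codons can be computed from the start and stop positions. The sketch below is illustrative only. It assumes 1-based positions and that the stated stop position (1602) is the first nucleotide of the TGA codon; neither convention is stated in the text, and the helper name `orf_codons` is hypothetical.

```python
def orf_codons(start, stop):
    """Number of sense codons between a start codon beginning at `start`
    and a stop codon beginning at `stop` (1-based positions, assumed convention)."""
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3

# ATG at nucleotide 453, TGA at nucleotide 1602
nov4a_codons = orf_codons(453, 1602)
print(nov4a_codons)  # 383
```

Under these assumptions the coordinates give 383 codons, which agrees with the stated length of the encoded polypeptide.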

TABLE 25

CTCTGACTGCTCCTATTGAGCTGTCTGCn'CGCTGTGCCCGCTGTGCCTGCTGTGCCCGCG 
CTGTCGCCGCTGCTACCGCGTCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGATTGGA 
GCCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAGCCCAGGCTGCCCCGTGAGTC 
CCATAGTTGCTACAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATCCCCTTGGG 

20 GCTGCTGTTCCTGGTCCGCGGATCCCAAGGCTACCTCCTGCCCAACGTCACTCTCTTAGA 
GGAGCTGCTO^GCAAATACCAGCACaACGAGTCTCACTCCCGGGTCre^ 
CAGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACaAGCTTCGGGGCCAGGTG 
TCAGGCCTCCyVACATGGAGTACATGACCTGGGATGACGAACTGGGGCAGGTATCGCTCTC 
CGGGGTTCCATGTGCAGTCCTGGTATGACGAGGTGAAGGACTACACCTACCCCTACCCGA 

25 GCGAGTGCAACCCCTGGTGTCCA6AGAGGTGCTCGGGGCCTATGTGCACGCACTACACAC 
AGATAGTTTGGGCCACCACCAACAAGATCGGTTGTGCTGTGAACACCTGCCGGAAGATGA 
CTGTCTGGGGAGAAGTTTGGGAGAACGCGGTCTACTTTGTCTGCAATTATTCTCCAAAGG 
GGAACTGGATTGGAGAAGCCCCCTAOiAGAATGGCCGGCCCTGCTCTGAGTGCCCACCCA 
GCTATGGAGGCAGCTGCAGGAAGAACTTGTGTTACCGAGAAGAAACCTACACTCCAAAAC 

30 CTGAAACGGACGAGATGAATGAGGTGGaAACGGCTCCCATTCCTGAAGAAAACCATGTTT 
GGCTCCAACCGAGGGTGATGAGACCCACCAAGCCCAAGAAAACCTCTGCGGTCAACTACA 
TGACCCAAGTCGTCAGATGTGACACCAAGATGAAGGACAGGTGCAAAGGGTCCACGTGTA 
ACAGGTACCaVGTGCCCAGCAGGCTGCCTGAACCACyiAGGCGAAGATCTTTGGAACTCTGT 
TCTATGAAA6CTCGTCTAGCATATGCCGCGCCGCCATCC7VCTACGG6ATCCTGGATGACA 

35 AGGGAGGCCTGGTGGATATGACCAGGAACGGGAAGGTCCCCTTCTTOSTGAAGTCTGAGA 
6ACACGGCGTGCAGTCCCTCAGCAAATACAAACCTTCCAGCTCATTCATGGT6TCAAM 
TGAAAGTGCAGGATTTGGACTGCTACACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGC 
CAGCAACTCACTGCCCAAGAATCCATTGTCCGGCACACTGCAAAGACGAACCrTCCT 
GGGCTCCGGTGTTTGGAACCAACATCTATGCAGATACCTCAAGCATCTGCAAGACAGCCG 

TGCACGCGGGAGTCATCAGCAACGAGAGTGGGGGTGACGTGGACGTGATGCCCGTGGATA 
AAAAGAAGACCTACGTGGGCTCGCTCAGGAATGGAGTTCAGTCTGAAAGCCT6GGGACTC 
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CTCGGGATGGAAAGGCCTTCCGGATCTTTGCTGTCAGGCAGTGAATTTCCyVGCAC^ 
GAGIAGGGGCGTarrCAGGPJSGGCirra 

GGGGTATATGGAGAGTCAGGAAACTTCCrTTGACTGATGTTaWSTGTCCATm 
GCCTGTGGGTGAGGTGACATCTCATCCCCTCACrGAAGCflACaGCATCCCAAGGTGCTCa 
5 GCCGGACTCCCTGGTGCCTGATCCTGCTGGGGCCOSGGGGTCTCCATCTGGACGTCCTCT 
CTCCTTTAGAGATCTGZVGCTGTCTCPTJUUW^GGGACaGTTGCC 
. TGTGTTCTTCTGTTGGTGGAGGJU^TTGATTTCaVACCTCCCTGCCAAAAGAAC^ 
TTGAAGCTCACSiATTGTGAAGCATTCACGGCGTa^GAAGAGGCCTT^ 
ATGAGTTTGJVGGa^TGAAGTAGAAGGTAGTTATTTAAAAATAAAAAACaC^ 
1 0 TACCaATAGAGGAAAATGGTTTTAATGTTTGCTGGTCaGACaGACAAATGGGCTAGAGTA 
AGAGGGCT6CGGGTATGAGAGACCCCGGCTCCGCCCTGGCACGTGTCCTTGCTGGCGGCC 
CGCCACAGGCCCCCTTCAATGGCCGCATTCAG6ATGGCTCTATACACAGCAGTGCTGGTT 
TATGTAAAGTTCAGCAGTCACTTCA (SEQ ID NO: 14) 

OTNWGRYRSPGFHVQSWYDEVKDYTYPYPSECKPWCPERCSGPMCTHyTQIVWATTN 
KIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKGmiGEAPYKNGRPCSECPPSYG^ 
CRNNLCYREETYTPKPETDEMNEVETl^IPEENHVWLQPRVMRPTKPKKTSAVN^ 
QVVRCDTKMKDRCKGSTCNRYQCPAGCIJJJHKAKIFGTLFyESSSSICRAAIHYG 
DKGGLVDITRNGKVPFFVKSERHGVQSLSKYKPSSSFIWSK\^QDLDCOTTVAQLC 
PFEKPATHCPRIHCPAHCKDEPSYWAPVFGTNIYADTSSICKTAVHA6VISNESGGD 
VDVMPVDKKKTYVGSLRNGVQSESI/3TPRDGKAFRIEAVRQ (SEQ ID NO: 15) 

The disclosed NOV-4a amino acid sequence has a high level of homology (99% identity,
99% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), shown in Table 26. Moreover, the NOV-4a amino acid sequence has homology
(72% identity, 82% similarity) to a known human trypsin inhibitor (TREMBL ACC No:
O43692), also shown in Table 26. As indicated by the "Expect" values, the probability of
these alignments occurring by chance alone is 0.0 and 5.3e-51, respectively.
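
The percentages quoted above follow from the residue counts reported with each alignment in Table 26 (380 of 381 identical residues for the first alignment; 85 of 117 identical and 97 of 117 positive for the second). A minimal Python sketch of that arithmetic is given here. It assumes the percentages are truncated to whole numbers rather than rounded, an assumption made only because it reproduces the printed values, and `percent` is a hypothetical helper.

```python
def percent(matches, aligned_length):
    """Integer percentage, truncated toward zero (assumed reporting convention)."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * matches) // aligned_length

identity_first = percent(380, 381)     # first alignment, identities
identity_second = percent(85, 117)     # second alignment, identities
similarity_second = percent(97, 117)   # second alignment, positives
print(identity_first, identity_second, similarity_second)
```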

TABLE 26 

Score = 786 bits (2031), Expect = 0.0
Identities = 380/381 (99%), Positives = 381/381 (99%) 

N0V4a: 3 NWGRYRSPGFHVQSWyDBVKDYTYPyPSEONPWCPERCSGEMCTHYTQrvWATTNKIGC^ 62 

+WGRYRSPGPHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCA 
TRYP : 117 HWGRYRSPXSFHVQSWYDEVKDYTYPyPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCA 176 

N0V4a: 63 VNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCTO 122 

VNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 
TRYP : 177 VNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 236 

N0V4a: 123 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYOTQVVRCDTKMKD 182 

EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPBCKTSAVNYMTQVVR 
TRYP : 237 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD 296 

N0V4a: 183 RCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 242 

RCKGSTCNRYQCPAGCIiNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 
TRYP : 297 RCKGSTGNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 356 
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N0V4a: 243 PFFVKSERHGVQSLSKYKPSSSEMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 302 

PFFVKSERHGVQSLSKraPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 
TRYP : 357 PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTWAQLCPFEKPATHCPRIHCPA^ 416 



N0V4a: 303 CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLR^ 362 

CKDEPSWAPVFGTNIYADTSSICOTAVHAGVISNESGGbVDVMPVDKKKTYVGSI^ 
TRYP : 417 CKDEPSYWAPVFGTNIYADTSSICKTAVHAG\rESNESGGDVDVMPVDKKKTYVGSLra 476 



NOV4a: 363 QSESLGTPRDGKAFRIFAVRQ 383 (SEQ ID NO: 72)

QSESLGTPRDGKAFRIFAVRQ 
TRYP : 477 QSESLGTPRDGKAFRIFAVRQ 497 (SEQ ID NO: 40) 



Score = 530 (186.6 bits), Expect = 5.3e-51, P = 5.3e-51
Identities = 85/117 (72%), Positives = 97/117 (82%)



N0V4a: 



5 GRYRSPGPHVQSWYDEVKDYTYPYPSEOIPWCPERCSGPMCTHYTQrVWATTNKIGra^ 64 
GRYRS V+ WYDEVKDY +PYP +CNP CP RC GPMCTHyTQ+VWAT+N+IGCA++ 
130 GRYRSILQLVKPWYDEVKDYAFPYPQDCailPRCPMRCFGPMCTHyTQMVWATSNRI 189 



TRYP 



N0V4a: 65 TCRKMTVWGEVWENAVYFVOTySPKGNWIGEAPYKNGRPCSECPPSYGGSCaai^^ 121 (SEQ 
ID NO: 73) 

TC+ M VWa VW AVY VCNY+PK)C»IWIGEAPYK G PCS CPPSYGGSC +NI#C+ 
TRYP : 190 TCQNMNVWGSVWRRAVYLVCNYAPKCaWIQEAPYKVGVPCSSCPPSYGGSCTDNL 246 (SEQ 
25 ID NO: 41) 

Furthermore, a PROSITE database search of protein families and domains confirmed
that a NOV-4a polypeptide is a member of the trypsin inhibitor family. One of the conserved
regions found in trypsin inhibitors is a SCP domain, located at the C-terminal half. The
pattern of this conserved domain is: [LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-P
[GL]-N-[LIVMFYWDN] (SEQ ID NO: 56). This pattern is found in amino acids 81-92 of
SEQ ID NO: 15.
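
A PROSITE-style pattern of this kind maps directly onto a regular expression, which gives a simple way to locate the motif in a sequence. The sketch below is a hedged illustration, not the search that was actually performed. The pattern as printed lists eleven positions, whereas the cited span (amino acids 81-92) covers twelve residues, so the regular expression assumes one additional wildcard position between P and [GL]; that extra position is an assumption. The names `SCP_PATTERN` and `find_motif` are hypothetical, and the test fragment is residues 63-122 of NOV-4a as shown in the Table 26 alignment.

```python
import re

# Assumed regex rendering of the printed pattern; the "." between P and [GL]
# is an assumption made so that the motif spans twelve residues.
SCP_PATTERN = re.compile(r"[LIVMFYH][LIVMFY].C[NQRHS]Y.P.[GL]N[LIVMFYWDN]")

def find_motif(fragment, first_position):
    """Return the 1-based (start, end) of the first match, given the sequence
    position of the fragment's first residue, or None if nothing matches."""
    m = SCP_PATTERN.search(fragment)
    if m is None:
        return None
    return (first_position + m.start(), first_position + m.end() - 1)

# Residues 63-122 of NOV-4a, taken from the Table 26 alignment
fragment = "VNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR"
motif_span = find_motif(fragment, 63)
print(motif_span)  # (81, 92)
```

With the assumed wildcard, the match falls on YFVCNYSPKGNW, which corresponds to the amino acid positions cited in the text.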

PSORT analyses indicate that NOV-4a is likely located in the nucleus (certainty =
0.3000). The predicted molecular weight of NOV-4a is 43185.7 daltons.

Based on its relatedness to known members of the trypsin inhibitor family of proteins,
NOV-4a provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the trypsin inhibitor
protein family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.



NOV-4b

A disclosed NOV-4b sequence according to the invention is a nucleic acid sequence
that encodes a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4b nucleic
acid and its encoded polypeptide are included in Table 27. The disclosed nucleic acid (SEQ
ID NO: 16) is 2400 nucleotides in length and contains an open reading frame (ORF) that
begins with an ATG initiation codon at nucleotide 205, and ends with a TGA stop codon at
nucleotide 1697. A disclosed, representative ORF encodes a 497 amino acid polypeptide
(SEQ ID NO: 17).

TABLE 27

CTCTGACrrGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCn'GCTGTGCCCGCGCTC^ 
TCTACTGGACGCGGGAGACGCCyiGCGAGCTGGTGATTGGAGCCCrGCGGaGAGCrPa^^ 

CCCAGGCT6CCCCGTGAGTCCCATAGTTGCTGa\GGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATCCCCT 

GCTGCTGTTCCTGGTCCGCGGATCCCaAGGCrACCTCCTGCCCAACGTCaCT 
1 0 AGCACAACGAGTCrCACTCCCGGGTCCGCAGaGCCATCCCCAGGGAGGACAAGGAGG 

CTTCGGGGCCaGGTGCAGCCTCAGGCCTCCAACATGGAGTACATGACCTGGGATGACG^ 

GTGGGCCAGTCAGTGCATCTGGGAGCACGGGCCCACCGGTCTGCTGGTGTCCATCGGGCaGAACCT 

GCAGGTATCGCTCTCCGGGGTTCCyVTGTGCAGTCCTGGTATGACGAGGTGAAGGACrACACCT 

TGCftACCCCPGGTGTCCAGAGAGGTGCTCGGGGCCTATGTGCACGCACTACaCACAGAT 
1 5 GATCGGTTGTGCTGTGAACACCTGCCGGAAGATGACTGTCTGGGGAGEAAGTTTGG^ 

ATTATTCTCCATiAGGGGAACTGGATTGGAGAAGCCCCCTACAAGAATGGCCGGCCCTGCT 

GGAGGCAGCTGCAGGAACAACTTGTGTTACCGAGAAGAAACCrACaCTCC^^ 

GGAAACGGCTCCCATTCCTGAAGAAAACCATGTTTGGCrCCAACCGAGGGTGATGAGACCCACa^ 

CTTGCGGTCAACTACATGACCCa^TCGTaUSATGTGACACCAAG^^ 
20 TACCaGTGCCCAGCAGGCTGCCTGAACCACAAGGCGAaGATCTTTGGAAGTCT 

CCGCGCCGCOVTCCACTACGGGATCCTGGATGACAAGGGAGGCCrGGTGGATATaiCC^^ 

TCGTGAAGTCTGAGAGACACGGCGTGCAGTCCCTCAGCaAATAauUlCCT^ 

GTGCAGGATTTGGACTGCTACACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGCC^ 

TTGTCCGGOVCACPGCAAAGACGAACCTTCCTACTGGGCTCCGGTGim'GGAACC^ 
25 TCTGCAAGACAGCCGTGCACGCGGGAGTCATCAGCAACGAGAGTGGGGGTGAa^ 

AAGACCTACGTGGGCTCGCTCAGGAATGGAGTTCAGTCTGTVAAGCCTGGGGACTCCT^ 

CTTTGCTGTCAGGCAGtGAATTTCCAGCACCAGGGGAGAAGGGGCGTCrTCAGGAGGGCTTCGGG^ 

TTATTTTGTCATTGCGGGGTATATGGAGAGTCAGGAAACTTCOTTTGACTGATGTTCAGTGTCCATO^CTO 

TGGGTGAGGTGACATCTCATCCCCTCaVCTGAAGCaUiCAGCATCCa^A^ 
30 GCTGGGGCCCGGGGGTCTCCATCTGGACGTCCrrCTCrPCCTTTAGAGATCTGAGCTGTCTCTTAATVGGGGACA^ 

AAATGTTCCTTGCTATGTGTTCTTCTGTTGGTGGAGGAAGTTGATTTCAACCTCCCTGCC^ 

GCTCACAATTGTGAAGCATTCACGGCGTCGGAAGAGGCCTTTTGAGCM 

GTAGTTATTTAAAAATAAAAAACACy\GTCCGTCCCTACCAATAGAGGAAAATGGTTTTAATGT^ 

AAATGGGCTAGAGTAAGAGGGCTGCGGGTATGAGAGACCCCGGCTa^GCCCTaXACGTGTCCCT 
35 CaGGCCCCCTTCAATGGCCGCATTCAGGATGGCTCTATACACAGCAGTGCTGGT^ 

(SEQ ID NO: 16)



MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIIMLH^ 
EYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWGRYRSPGFHVQSWYDEVKD^^ 
40 SGPMCTHYTQIVWATTNKIGCAVNTCRraOTWGEVWENAVYFVCNYSPKGNWIGEAP^ 
NLCYREETYTPKPETDEMNEN^TAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDT 
PAGCLNHKAKIFXSSLETESSSSICRAAIHYGILDDKGGLVDITRNGKVPFEVKSERHGVQSLSKYKPSSSFM^ 
VQDLDOTTVAQLCPFEKPATHCPRIHCPAHCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISN^ 
DKKKTYVGSLRNGVQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 17)

The disclosed NOV-4b amino acid sequence has 124 of 191 amino acid residues (64%)
identical to, and 148 of 191 (77%) similar to, a known human trypsin inhibitor (TREMBL
ACC No: O43692), as shown in Table 28. As indicated by the "Expect" value, the probability
of this alignment occurring by chance alone is 6.1e-73, which is a very low probability score.
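
The Expect value and the P value reported with this alignment in Table 28 are numerically identical, which is what would be anticipated for very small values. As a hedged illustration, the sketch below applies the relation P = 1 - exp(-E), the standard conversion in BLAST-style statistics; that relation is not stated in the text, and `p_from_expect` is a hypothetical helper.

```python
import math

def p_from_expect(expect):
    """Probability of at least one chance hit, P = 1 - exp(-E)."""
    if expect < 0:
        raise ValueError("Expect value cannot be negative")
    # -expm1(-E) avoids the rounding loss of 1 - exp(-E) when E is tiny
    return -math.expm1(-expect)

p_value = p_from_expect(6.1e-73)
print(p_value)
```

For an Expect value this small the two quantities agree to within floating-point precision; they diverge only as E approaches 1.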

TABLE 28 

Score = 737 (259.4 bits), Expect = 6.1e-73, P = 6.1e-73
Identities = 124/191 (64%), Positives = 148/191 (77%)

N0V4b: 45 SRVRRAIPREDKEEILMLHOTOiRGQVQPOASmEYMTWDDEIiEKBA 104 

+R +R I + D III HN++RG+V P A+NMKXM WD+ I, KSA AWA+ CIW+HGP+ 
TRYP : 56 ARRKRYISQfflDMIAILDYHNQVRGKN^PPAANMEyMVWDENtiAKSAEAW 115 

15 N0V4b: 105 GLLVSIGQNLGAHWGRYRSPGFm^QSWyDEVKDYTYPyPSEOTPWCPERC^ 164 
LL +GQNL GRYRS V+ WYDEVKDY +PYP +CNP CP RC GPMCTHYTQ 
TRYP : 116 YIJjRFIiGQin^STOTGRYRSIIiQLVKPWYDEVKDYJ^^ 175 

N0V4b: 165 IWATTNKIGCAVNTCRKMTVWGBVWEKAVyFVCNYSPK^^ 224 
20 ■♦•VWAT+N+IGC7^++TC+ M VWG VW AVY VCNY+PKGNWIGEAPYK G PCS CPPS 

TRYP : 176 MVWATSNRIGCAIHTCQNMmWGSVWRRAVYLVCNYAPKGNW^ 235 

NOV4b: 225 YGGSCRNNLCY 235 (SEQ ID NO: 74) 
YGGSC +Kr*C+ 

25 TRYP : 236 YGGSCTDNLCF 246 (SEQ ID NO: 42) 

Furthermore, a PROSITE database search of protein families and domains confirmed
that NOV-4a is a member of the trypsin inhibitor family. One of the conserved regions found
in trypsin inhibitors is a SCP domain, located at the C-terminal half. The pattern of this
conserved domain is: [LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-P
[GL]-N-[LIVMFYWDN] (SEQ ID NO: 56). This pattern is found in amino acids 195-206 of SEQ ID
NO: 17.

SignalPep and PSORT analyses indicate that NOV-4b is likely located outside of
the cell (certainty = 0.6950), and is likely to have a cleavable N-terminal signal sequence with
a cleavage site between positions 22 and 23: SQG-YL. The predicted molecular weight of
NOV-4b is 55928.2 daltons.
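
The predicted cleavage site can be checked against the listed sequence by splitting the polypeptide after residue 22. The following is a minimal sketch, assuming 1-based residue numbering; the helper `split_signal_peptide` is hypothetical, and the input is the first 52 residues of the NOV-4b polypeptide as listed in Table 27.

```python
def split_signal_peptide(protein, cleavage_after):
    """Split a protein into (signal peptide, mature chain) after a 1-based position."""
    if not 0 < cleavage_after < len(protein):
        raise ValueError("cleavage position must fall inside the sequence")
    return protein[:cleavage_after], protein[cleavage_after:]

# First 52 residues of NOV-4b (Table 27)
n_terminus = "MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIP"
signal, mature = split_signal_peptide(n_terminus, 22)
cleavage_context = signal[-3:] + "-" + mature[:2]
print(cleavage_context)  # SQG-YL
```

The residues flanking the split reproduce the SQG-YL context given in the text.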

Based on its relatedness to known members of the trypsin inhibitor family of proteins,
NOV-4b provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the trypsin inhibitor
protein family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.

NOV-4c

A NOV-4c sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4c nucleic acid and its
encoded polypeptide are included in Table 29. The disclosed nucleic acid (SEQ ID NO: 18) is
1669 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 205, and ends with a TAG stop codon at nucleotide 1636.
The representative ORF encodes a 205 amino acid polypeptide (SEQ ID NO: 19).

TABLE 29

TCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCTGCTGTGCCCGCGC 
TGTCGCCGCTGCTACCGCGTCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGATTGGAG 
CCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAGCCCAGGCTGCCCCGTGAG^ 
CATAGTTGCTGCAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATCCCera 

15 

CTGCTGTTCCTGGTCTGCGGATCCCAAGGCTACCTCCTGCCCAACGTCACTCTCTTAGAG 

GAGCTGCTCAGCAAATACCAGCACAACGAGTCrCACTCCCGGGTCCGCA^ 

AGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACAAGCTTCGGGGCCAGGTGCAGC^ 

CAGGCCTCCAACaTGGAGTACATGACCTGGGATGACGAACTGGAGAAGTCTGCTGCAGCG 

TGGGCCAGTCAGTGCATCTGGGAGCACGGGCCaVCaSGTCTGCTGGTGTCCAT^^ 
20 AACCTGGGCGCTCACTGGGGCAGGTATCGCTCTCCGGGGTTCCATGTGCAGTCCTGGTAT 

GACGAGGTGAAGGACTACACCTACCCCTACC(^GCGAGTGCAACCCCTGGTGTCa^ 

AGGTGCTCGGGGCCTATGTGCACGCACTACACACAGATAGTTTGGGCCACCACCAACAAG 

ATCGGTTGTGCTGTGAACACCTGCCGGAAGATGACTGTCTGGGGAGAAGTTTGGGTiGAAC 

GCGGTCTACTTTGTCTGCAATTATTCTCaUWSGGGAACTGGATTGGAGAAGCCC^ 
25 AAGAATGGCCGGCCCTGCTCTCAGTGCCCACCCAGCTATGGAGGCAGCTGCaGGAACM 

TTGTGTTACCGAGAAGAAACCTACaCTCCAAAACCPGAAACGGACGAGATGAATGAGGTG 

GAAACGGCTCCCATTCCTGAAGAAAACCaTGTTTGGCTCCAACCGAGGGTGATGAGACCC 

ACCMGCCCAAGAAAACCTCTTOSGTCAACTACATGACCCAAGTCGTCTT^^ 

AAGATGAAGGACAGGTGCAAAGGGTCGACGTGTAACAGG^ACCAGTGCCOVGCAGGCTGC 
30 CTGAACCACAAGGCGAAGATCTTTGGT^CTCTGTTCTATGAAAGCTCGTCTAGCATATGC 

CGCGCCGCCTVTCCACTACGGGATCCTGGATGACAAGGGAGGCCTGGTGGATATa^^ 

AACGGGAAGGTCCCCTTCTTCGTGAAGTCTGAGAGACaVCGGCGTGCAGTCCCTCAGC^^ 

TACAAACCTTCCAGCTCATTCyVTGGTGTCTiAAAGTGAAAGTGCA 

AOSACCGTTGCTCAGCTGTGCCCGTTTGAAAAGCCAGCAACTCACTGCCCAAGAATCCAT 
35 TGTCCGGCACACTGCAAAGACGAACCTTCCTACTGGGCTCeGGTGTTTGGAACCAA 

TATGCAGATACCTCAAGCATCTGCAAGACAGCCGTGCACGCGGGAGTCATCAGCAACGAG 
AGTGGGGGTGACGTGGACGTGATGCCCGTGGATAAAAAGAAGACCTACACCTGCCCGGCA 
GCCGCTCGAGCCCTATAGTGTAAACCGATTCGCAGCACACTGGCGCCGT (SEQ ID 

NO: 18) 

40 

MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIP
REDKEEILMLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPT
GLLVSIGQNLGAHWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSG
PMCTHYTQIVWATTNKIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKGNWIG
EAPYKNGRPCSQCPPSYGGSCRNNLCYREETYTPKPETDEMNEVETAPIPEE
NHVWLQPRVMRPTKPKKTSSVNYMTQVVLCDTKMKDRCKGSTCNRYQCPAGC
LNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFVKSER
HGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH
CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTY
TCPAAARAL (SEQ ID NO: 19)

The disclosed NOV-4c amino acid sequence has a high level of homology (97%
identity, 97% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), shown in Table 30. As indicated by the "Expect" value, the probability of this
alignment occurring by chance alone is 0.0, the lowest probability score.

15 TABLE 30 

Score = 948 bits (2452), Expect = 0.0
Identities = 458/468 (97%), Positives = 460/468 (97%)

N0V4c: 1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTXXXXXXSKYQHNESHSRVRRAIPREDKEEIL 60 
20 MSCVLGGVIPLGLLFLVCGSQGYLLPNVT SKYQHNESHSRVRRAIPREDKEEIL 

TRYP : 1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL 60 

N0V4c: 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWGR 120 
MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPT LLVSIGQNLGAHWGR 
25 TRYP : 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGR 120 

N0V4c: 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVmTTNKIGCAVNTC 180 

YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNTC 
TRYP : 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNTC 180 

N0V4c: 181 RBCMTVWGEVWENAVYFVCmSPKGNWIGEAPYBCNGRPCSQCPPSYGGSCRNNLCYREETY 240 

RKMTVWGEVWEMAVYFVCNYSPKGNJflGEAPYKNGRPCS+CPPSYGGSCRNNIiCYREETY . 
TRYP : 181 RKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYREETY 240 

35 N0V4c: 241 TPKPETDEMNEVETAPIPEENHWLQPRVMRPTKPKKTSSVNYMTQVVLCDTE(MKDRCKG 300 
TPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKBCTS+VNYMTQW CDTKMKDRCKG 
TRYP : 241 TPKPETDEMNEVETAPIPEENHVWLQPRVMRPTBCPKKTSAVNYMTQVVRCDTKMKDRCKG 300 

N0V4c: 301 STCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFV 360 
40 STCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFV 

TRYP : 301 STCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFV 360 

N0V4c: 361 KSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKDE 420 
KSERHGVQSLSKYKPSSSEWVSKVKVQDLDCYTTVAQIiCPFEKPATHCPRIflCPAHCKDE 
45 TRYP : 361 KSERHGVQSLSECYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKDE 420 

N0V4c: 421 PSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTY 468 (SEQ ID NO' 
75) 

PS YWAPVFGTNI YADTS S I CKTAVHAGVI SNESGGDVDVMPVDKKKT Y 
50 TRYP : 421 PSYWAPVFGTNIYADTSSICKTAVHAGVI SNESGGDVDVMPVDKKKT Y 468 (SEQ ID NO: 
43) 



Furthermore, a PROSITE database search of protein families and domains confirmed that
NOV-4c is a member of the trypsin inhibitor family. One of the conserved regions found in
trypsin inhibitors is a SCP domain, located at the C-terminal half. The pattern of this
conserved domain is: [LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-P
[GL]-N-[LIVMFYWDN] (SEQ ID NO: 56). This pattern is found in amino acids 81-92 of SEQ ID
NO: 19.

In addition, SignalPep and PSORT analyses indicate that NOV-4c is likely located
outside of the cell (certainty = 0.8200), and is likely to have a cleavable N-terminal signal
sequence with a cleavage site between positions 22 and 23: SQG-YL. The predicted
molecular weight of NOV-4c is 53587.7 daltons.

Based on the relatedness between NOV-4c and the conserved trypsin inhibitor
proteins, the NOV-4c protein is a novel member of the trypsin inhibitor family. NOV-4c
provides new diagnostic and therapeutic compositions useful in the treatment of disorders
associated with alterations in the expression of members of the trypsin inhibitor protein
family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.



NOV-4d

A NOV-4d sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4d nucleic acid and its
encoded polypeptide are included in Table 31. The disclosed nucleic acid (SEQ ID NO: 20) is
2403 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 206, and ends with a TGA stop codon at nucleotide 1700.
A disclosed, representative ORF encodes a 498 amino acid polypeptide (SEQ ID NO: 21).

TABLE 31 

CTCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGT6CCTGCTGTGCCCGCGCTGTCGCCGCT 
30 TCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGATTGGAGCCCTGCGGAGAGCTCAAGCGCCCA^ 

CCCAGGCTGCCCCGTGAGTCCCATAGTTGCTGCAGGAGTGGAGCCATGAGCTGCGTCCrrGGGTGGTGTCAT^^ 

GCTGCTGTTCCTGGTCCGCGGATCCCAAGGCTACCTCCTGCCaUVCGTCACTCT 

AGCACAACGAGTCTCACTCCCGGGTCCGCAGAGCCATCCCCAGGGAGGACAAGGAGG^ 

CTTCGGGGCCAGGTGCAGCCTCAGGCCTCCAACATGGAGTACyVTGACCTGGGAT^ 
35 GTGGGCCAGTCAGTGCATCTGGGAGCACGGGCCCACCAGTCTGCTGGTGTCCATCGGGCAGAAC 

GCAGGAGGTATCGCTCTCCGGGGTTCCATGTGraGTCCTGGTATGACGAGGTGAAGGACTAmCCTA 

GAGTGO^CCCCTGGTGTCCAGAGAGGTGCTCGGGGCCTATGTGCACGCACTACACACAGATAGT^ 
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CAAGATCGGTTGTGCTGTGAACACCTGCCGGaAGATGACTGTCTGGGGAGAA^ 
GCAATTATTCTCCAAAGGGGAACTGGATTGGAGAAGCCCCCTACAAGAA 
TATGGAGGCAGCTGCaiGGAACAACTTGTGTTACCGAGAAGAAACCTACACrrC^^ 
GGTGGAAACGGCTCCCaiTTCCTGAAGAAAACCATGTTTGGCTCCaACCGAG^ 
CCTCTGCGGTCaVACTACATGACCCAAGTCGTCAGATGTGACACCaUVGATG^ 
AGGTACCAGTGCCCAGCAGGCTGCCTGAACCACAAGGCGAAGATCTTTGGAAGTCTOT^ 
ATGCCGCGCCGCCATCOlCTACmSATCCTGGATGACMGGGAGGCCTGGTGGATAT^^ 
TCTTCGTGAAGTCTGAGAGACACGGCGTGCAGTCCCTCAGCyVAATACAAACC^ 
AAAGTGCAGGATTTGGACTGCTACaasaCCGTTGCTCaWSCTGTGCCC^^ 
CCATTGTCCGGCAGACTGOUAGACGAACCPTCCTACTGGGCTCCGG^ 
GCATCTGCAAGACAGCCGTGCACGCGGGAGTCATCAGCAAaSAGAGTGGGGGTa 
AAGAAGACCTACGTGGGCTCGCTCAGGAATGGAGTTCAGTCTGAAAGCCTGGG^ 
GATCTTTGCTGTCAGGCAGTGAATTTCCAGCACCAGGGGAGAAGGGGCGTCTTCA^^ 
TTTTTATTTTGTCATTGCGGGGTATATGGAGAGTCAGGAAACTTCCTTTGACTGATGTTCM 
CTGTGGGTGAGGTGACATCTCATCCCCTCACTGAAGGAACAGaVTCCCaA 
CCTGCTGGGGCCCGGGGGTCTCCATCTGGACGTCCTCTCTCCTTTAGAGAT^ 
CCAAAATGTTCCTTGCTATGTGTTCTTCTGTTGGTGGAGGAAGTTGATTTOkACCT^ 
GAAGCTCaCAATTGTGAAGCaTTCACGGCGTCGGMGAGGCCTTTTGAGOW^^ 
AAGGTAGTTATTTAAAAATAAAAAACavCAGTCCGTCCCTACCAATAGAGGAAAATGGTTTTA^^ 
GACaiAATGGGCTAGAGTAAGAGGGCTGCGGGTATGAfiAGACCCCGGCTCCGCCCTGGC^ 
CCACAGGCCCCOTTCAATGGCOKZATTCAGGATGGCTCTATACAC^^ 

TCA (SEQ ID NO: 20)



MSCVLGGVIPLGLLEXVRGSQGYIJCiPNVTLLEELLSKYQHNESHSRVRRAIPRKD^ 
QPQASNMEyMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGRRYRSPGFHVQSWYDEVKDYT 
YPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNTCRKMTWGEVWENAVY 

apykngrpcsecppsyggscrnnu;yreetytpkpetdemnevetapipeenhvwlqprvmr^ 

AVNYMTQVVRCDTKMKDRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVD 
ITRNGECVPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTWAQLCPFEKPATHCPRIHCPA^ 

EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSESLGTPRDGKA
FRIFAVRQ (SEQ ID NO: 21)

The disclosed NOV-4d amino acid sequence has a high level of homology (98%
identity, 98% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), as shown in Table 32. As indicated by the "Expect" value, the probability of this
alignment occurring by chance alone is 0.0, the lowest probability score.

TABLE 32 

Score = 1007 bits (2605), Expect = 0.0 
Identities = 489/498 (98%), Positives = 490/498 (98%), Gaps = 1/498 (0%)

N0V4d: 1 MSCVLGGVIPLGLLFLVRGSQGYLLPNVTXXXXXXSKYQHNESHSRVRRAIPREDKEEIL 60 

MSCVLGGVIPLGLLFLV GSQGYLLPNVT SKYQHNESHSRVRRAIPREDKEEIL 
TRYP : 1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL 60 

N0V4d: 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWG 120 
MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWG 
^ TRYP ; 61 MLHNKLRGQVQPQASN^ffiYMTWDDELEKSAAAWASQCIWEHGPTSIJlVSIGQNLGa^ 119 

N0V4d: 121 RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNT 180 

RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNT 
TRYP : 120 RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNT 179 

10 N0V4d: 181 CRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYBCNGRPCSECPPSYGGSCRNNLCYREET 240 
CRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYREET 
TRYP : 180 CRKMTVWGEVWENAVYFVCNYSPKG^WIGEAPYKNGRPCSECPPSYGGSCRMLCYR^ 239 

N0V4d: 241 YTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKICTSAVNYMTQVVRCDTKMKDRCK 300 

YTPKPETDEMNEVETAPIPEENHVWLQPRVmPTKPKKTSAVNYMTQVVRCDTKI^DRCK 
TRYP : 240 YTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKDRCK 299 

N0V4d: 301 GSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFF 360 

GSTCNRYQCPA6CLNHKAKIFG+LFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFF 
TRYP : 300 GSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFF 359 

N0V4d: 361 VKSERHGVQSLSBCYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKD 420 

VKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKD 
TRYP : 360 VKSERHGVQSLSECYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKD 419 

N0V4d: 421 EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSE 480 

EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSE 
TRYP : 420 EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSE 479 

30 N0V4d: 481 SLGTPRDGKAFRIFAVRQ 498 (SEQ ID NO: 76) 
SLGTPRDGKAFRI FAVRQ 
TRYP : 480 SLGTPRDGKAFRIFAVRQ 497 (SEQ ID NO: 44) 
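
The percentages in Table 32 follow directly from the reported residue counts. As a minimal sketch (the helper name `percent` is illustrative and is not part of any alignment program; rounding to the nearest integer is an assumption about how the figures were displayed), they can be recomputed as follows:

```python
def percent(matches: int, aligned_length: int) -> int:
    """Return matches/aligned_length as a percentage, rounded to the nearest integer."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return round(100 * matches / aligned_length)


# Counts reported in Table 32 for NOV-4d versus CAB66795.
identity = percent(489, 498)   # identical residues
positives = percent(490, 498)  # identical or conservatively substituted residues
gaps = percent(1, 498)         # gap positions

print(identity, positives, gaps)
```

The same helper applies to any of the other alignment tables by substituting the counts printed in the table header.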

A PROSITE database search of protein families and domains confirmed that a NOV-4c
polypeptide is a member of the trypsin inhibitor family. One of the conserved regions found
in trypsin inhibitors is a SCP domain, located at the C-terminal half. The pattern of this
conserved domain is: [LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y^^
[LIVMFYWDN] (SEQ ID NO: 56). This pattern is found in amino acids 196-207 of SEQ ID
NO: 21.
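
A PROSITE-style pattern can be searched for by translating it into a regular expression. The sketch below is illustrative only: the function name `prosite_to_regex` is hypothetical, only the legible leading elements of the pattern printed above are used (the remainder is not reproduced here), and the sequence fragment is taken from the NOV-4d alignment in Table 32.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles x (any residue), [..] (allowed residues), {..} (forbidden
    residues), repeat counts such as x(2) or x(2,4), and the < and >
    terminal anchors.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count, e.g. x(2) or x(2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if lo is not None:
            repeat = "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


# Legible leading elements of the pattern only (an assumption-free subset).
regex = prosite_to_regex("[LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y")

# Fragment of the NOV-4d polypeptide surrounding the reported match region.
fragment = "ENAVYFVCNYSPKGNWIGE"
hit = re.search(regex, fragment)
print(regex, hit.start(), hit.group())
```

Because only the leading elements are used, a hit from this sketch is weaker evidence than a match to the complete PROSITE entry.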

Based on the relatedness between NOV-4d and the conserved trypsin inhibitor
proteins, NOV-4d is a novel member of the trypsin inhibitor family. NOV-4d provides new
diagnostic and therapeutic compositions useful in the treatment of disorders associated with
alterations in the expression of members of the trypsin inhibitor protein family. Nucleic acids,
polypeptides, antibodies, and other compositions of the present invention are useful in the
treatment and diagnosis of a variety of diseases and pathologies, including, by way of
nonlimiting example, those involving reproductive disorders, immunological disorders,
cancer, and metabolic disorders.

In addition, SignalPep and PSORT analyses indicate that NOV-4d is likely located
outside of the cell (certainty = 0.6950), and is likely to have a cleavable N-terminal signal
sequence with a cleavage site between positions 22 and 23: SQG-YL. The predicted
molecular weight of NOV-4b is 56114.4 daltons.

NOV-4e

A NOV-4e sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4e nucleic acid and its
encoded polypeptide are included in Table 33. The disclosed nucleic acid (SEQ ID NO: 22) is
2412 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 206, and ends with a TGA stop codon at nucleotide 1709.
A disclosed, representative ORF encodes a 501 amino acid polypeptide (SEQ ID NO: 23).
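
The stated ORF coordinates and polypeptide length can be cross-checked with simple arithmetic. The sketch below assumes that both positions are 1-based and that nucleotide 1709 is the first base of the TGA stop codon; the function name `orf_protein_length` is illustrative.

```python
def orf_protein_length(atg_position: int, stop_codon_position: int) -> int:
    """Number of amino acids encoded between the ATG and the stop codon.

    Both positions are 1-based and refer to the first base of the codon
    (an assumption about how the coordinates are reported).
    """
    coding_nt = stop_codon_position - atg_position
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("stop codon is not in frame with the initiation codon")
    return coding_nt // 3


# Coordinates given for the NOV-4e nucleic acid (SEQ ID NO: 22).
nov4e_length = orf_protein_length(206, 1709)
print(nov4e_length)
```

Under that assumption the coding region spans 1503 nucleotides, which is consistent with the 501 amino acid polypeptide described.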

TABLE 33 

CTCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCTGCTGTGCCCG 

CGCTGTCGCCGCTGCTACCGCGTCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGAT
TGGAGCCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAGCCCAGGCTGCCCCG 
TGAGTCCCATAGTTGCTGCAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATC 
CCCTTGGGGCTGCTGTTCCTGGTCCGCGGATCCCAAGGCTACCTCCTGCCCAACGTCA 
CTCTCTTAGAGGAGCTGCTCAGCAAATACCAGCACAACGAGTCTCACTCCCGGGTCCG 

CAGAGCCATCCCCAGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACAAGCTTCGG
GGCCAGGTGCAGCCTCAGGCCTCCAACATGGAGTACATGACCTGGGATGACGAACTGG 
AGAAGTCTGCTGCAGCGTGGGCCAGTCAGTGCATCTGGGAGCACGGGCCCACCGGTCT 
GCTGGTGTCCATCGGGCAGAACCTGGGCGCTCACTGGGGCAGGTATCGCTCTCCGGGG 
TTCCATGTGCAGTCCTGGTATGACGAGGTGAAGGACTACT^CCTACCCCTACCCGAGCG 

AGTGCAACCCCTGGTGTCCAGAGAGGTGCTCGGGGCCCATGTGCACGCACTACACACA
GGTAACTCAGATAGTTTGGGCCACCACCAACAAGATCGGTTGTGCTGTGAACACCTGC
CGGAAGATGACTGTCTGGGGAGAAGTTTGGGAGAACGCGGTCTACTTTGTCTGCAATT
ATTCTCCAAAGAGGGGGAACTGGATTGGAGAAGCCCCCTACAAGAATGGCCGGCCCTG
CTCTGAGTGCCCACCCAGCTATGGAGGCAGCTGCAGGAACAACTTGTGTTACCGAGAA
GAAACCTACACTCCAAAACCTGAAACGGACGAGATGAATGAGGTGGAAACGGCTCCCA
TTCCTGAAGAAAACCATGTTTGGCTCCAACCGAGGGTGATGAGACCCACCAAGCCCAA 
GAAAACCTCTGCGGTCAACTACATGACCCAAGTCGTCAGATGTGACACCAAGATGAAG 
GACAGGTGCAAAGGGTCCACGTGTAACAGGTACCAGTGCCCAGCAGGCTGCCTGAAGC 
ACAAGGCGAAGATCTTTGGAAGTCTGTTCTATGAAAGCTCGTCTAGCATATGCCGCGC 

CGCCATCCACTACGGGATCCTGGATGACAAGGGAGGCCTGGTGGATATCACCAGGAAC
GGGAAGGTCCCCTTCTTCGTGAAGTCTGAGAGACACGGCGTGCAGTCCCTCAGCAAAT 
ACAAACCTTCCAGCTCATTCATGGTGTCAAAAGTGAAAGTGCAGGATTTGGACTGCTA 
CACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGCCAGCAACTCACTGCCCAAGAATC 
CATTGTCCGGCACACTGCAAAGACGAACCTTCCTACTGGGCTCCGGTGTTTGGAACCA 
ACATCTATGCAGATACCTCAAGCATCTGCAAGACAGCCGTGCACGCGGGAGTCATCAG
CAACGAGAGTGGGGGTGACGTGGACGTGATGCCCGTGGATAAAAAGAAGACCTACGTG 
GGCTCGCTCAGGAATGGAGTTCAGTCTGAAAGCCTGGGGACTCCTCGGGATGGAAAGG 
CCTTCCGGATCTTTGCTGTCAGGCAGTGAATTTCCAGCACCAGGGGAGAAGGGGCGTC 
TTCAGGAGGGCTTCGGGGTTTTGCTTTTATTTTTATTTTGTCATTGCGGGGTATATGG 
AGAGTCAGGAAACTTCCTTTGACTGATGTTCAGTGTCCATCACTTTGTGGCCTGTGGG 
TGAGGTGACATCTCATCCCCTCACTGAAGCAACAGCATCCCAAGGTGCTCAGCCGGAC
TCCCTGGTGCCTGATCCTGCTGGGGCCCGGGGGTCTCCATCTGGACGTCCTCTCTCCT
TTAGAGATCTGAGCTGTCTCTTAAAGGGGACAGTTGCCCAAAATGTTCCTTGCTATGT
GTTCTTCTGTTGGTGGAGGAAGTTGATTTCAACCTCCCTGCCAAAAGAACAAACCATT 
TGAAGCTCACAATTGTGAAGCATTCACGGCGTCGGAAGAGGCCTTTTGAGCAAGCGCC 
AATGAGTTTCAGGAATGAAGTAGAAGGTAGTTATTTAAAAATAAAAAACACAGTCCGT 
CCCTACCAATAGAGGAAAATGGTTTTAATGTTTGCTGGTCAGACAGACAAATGGGCTA 
GAGTAAGAGGGCTGCGGGTATGAGAGACCCCGGCTCCGCCCTGGCACGTGTCCTTGCT 
GGCGGCCCGCCACAGGCCCCCTTCAATGGCCGCATTCAGGATGGCTCTATACACAGCA 
GTGCTGGTTTATGTAAAGTTCAGCAGTCACTTCA (SEQ ID NO: 22) 

MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEE
ILMLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGA
HWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQVTQIVWATTN
KIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKRGNWIGEAPYKNGRPCSECPPSYGGS
CRNNLCYREETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQ
VVRCDTKMKDRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDK
GGLVDITRNGKVPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFE
KPATHCPRIHCPAHCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVM
PVDKKKTYVGSLRNGVQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 23)

The disclosed NOV-4e amino acid sequence has a high level of homology (97%
identity, 97% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), shown in Table 34. As indicated by the "Expect" value, the probability of this
alignment occurring by chance alone is 0.0, the lowest probability score.

TABLE 34 

Score = 1001 bits (2588), Expect = 0.0
Identities = 488/501 (97%), Positives = 489/501 (97%), Gaps = 4/501 (0%)

NOV4e:   1 MSCVLGGVIPLGLLFLVRGSQGYLLPNVTXXXXXXSKYQHNESHSRVRRAIPREDKEEIL 60
           MSCVLGGVIPLGLLFLV GSQGYLLPNVT      SKYQHNESHSRVRRAIPREDKEEIL
TRYP :   1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL 60

NOV4e:  61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWGR 120
           MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPT LLVSIGQNLGAHWGR
TRYP :  61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGR 120

NOV4e: 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQVTQIVWATTNKIGCAV 180
           YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY   TQIVWATTNKIGCAV
TRYP : 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCAV 177

NOV4e: 181 NTCRKMTVWGEVWENAVYFVCNYSPKRGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 240
           NTCRKMTVWGEVWENAVYFVCNYSPK GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR
TRYP : 178 NTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 236

NOV4e: 241 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD 300
           EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD
TRYP : 237 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD 296

NOV4e: 301 RCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 360
           RCKGSTCNRYQCPAGCLNHKAKIFG+LFYESSSSICRAAIHYGILDDKGGLVDITRNGKV
TRYP : 297 RCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 356

NOV4e: 361 PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 420
           PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH
TRYP : 357 PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 416

NOV4e: 421 CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV 480
           CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV
TRYP : 417 CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV 476

NOV4e: 481 QSESLGTPRDGKAFRIFAVRQ 501 (SEQ ID NO: 77)
           QSESLGTPRDGKAFRIFAVRQ
TRYP : 477 QSESLGTPRDGKAFRIFAVRQ 497 (SEQ ID NO: 45)



In addition, SignalPep and PSORT analyses indicate that NOV-4e is likely located
outside of the cell (certainty = 0.6950), and is likely to have a cleavable N-terminal signal
sequence with a cleavage site between positions 22 and 23: SQG-YL. The predicted
molecular weight of NOV-4b is 56412.8 daltons.

Based on the relatedness between NOV-4e and the conserved trypsin inhibitor
proteins, the NOV-4e protein is a novel member of the trypsin inhibitor family. NOV-4e
provides new diagnostic and therapeutic compositions useful in the treatment of disorders
associated with alterations in the expression of members of the trypsin inhibitor protein
family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.

Table 35 shows a sequence alignment between the NOV-4 polypeptides according to
the invention and a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), indicating the homology between the present invention and the trypsin inhibitor
family. Moreover, the PROSITE conserved SCP region found in trypsin inhibitors is found in
sequences 151-162 of the trypsin inhibitor-like protein shown (shown in bold font).
NOV4e MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
NOV4a ------------------------------------------------------------
NOV4b MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
NOV4d MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
NOV4c MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
TRYP  --------------------------------------------ARRKRYISQNDMIAIL

NOV4e MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWG-
NOV4a ------------------------------------------------------MTNWG-
NOV4b MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWG-
NOV4d MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGR
NOV4c MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWG-
TRYP  DYHNQVRGKVFPPAANMEYMVWDENLAKSAEAWAATCIWDHGPSYLLRFLGQNLSVRTG-

NOV4e RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQVTQIVWATTNKIGCA
NOV4a RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
NOV4b RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
NOV4d RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
NOV4c RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
TRYP  RYRSILQLVKPWYDEVKDYAFPYPQDCNPRCPMRCFGPMCTHY---TQMVWATSNRIGCA

NOV4e VNTCRKMTVWGEVWENAVYFVCNYSPKRGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4a VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4b VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4d VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4c VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSQCPPSYGGSCRNNLCY
TRYP  IHTCQNMNVWGSVWRRAVXLVCNYAPK-GNWIGEAPYKVGVPCSSCPPSYGGSCTDNLCF
      ::**::*.***.**..***;****:** ********** * *** ^ ********* .***•

NOV4e REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4a REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4b REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4d REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4c REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSSVNYMTQVVLCDTKMK
TRYP

NOV4e DRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4a DRCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4b DRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4d DRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4c DRCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
TRYP

NOV4e VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4a VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4b VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4d VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4c VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
TRYP

NOV4e HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG
NOV4a HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG
NOV4b HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG
NOV4d HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG
NOV4c HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYT
TRYP

NOV4e VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 23)
NOV4a VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 15)
NOV4b VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 17)
NOV4d VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 21)
NOV4c CPAAARAL (SEQ ID NO: 19)
TRYP  (SEQ ID NO: 46)

Consensus key
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
 - no consensus

The expression pattern and protein similarity information for NOV-4 suggest that the
human trypsin inhibitor-like proteins described in this invention may function as a trypsin
inhibitor. Therefore, the nucleic acid and protein of the invention are useful in potential
therapeutic applications implicated, for example but not limited to, in allergies and infectious
diseases, in cancer, in metabolic disorders like obesity, hypertension and diabetes, and other
diseases and disorders.

Homology to antigenic secreted and membrane proteins suggests that antibodies
directed against the novel genes may be useful in treatment and prevention of allergic
reactions and infectious diseases. Expression in pituitary and adrenal gland suggests
therapeutic applications in metabolic disorders like obesity, hypertension and diabetes.
Similarity to a brain tumor overexpressed trypsin inhibitor suggests that the splice variants of
10093872 may be involved in the pathogenesis of these cancers. Hence it could be useful as a
cancer diagnostic marker or as a target for small molecule trypsin inhibitors in cancer
treatment.

Potential therapeutic uses for the invention(s) include, for example, the following: (i)
protein therapeutic, (ii) small molecule drug target, (iii) antibody target (therapeutic,
diagnostic, drug targeting/cytotoxic antibody), (iv) diagnostic and/or prognostic marker, (v)
gene therapy (gene delivery/gene ablation), (vi) research tools, and (vii) tissue regeneration in
vitro and in vivo (regeneration for all these tissues and cell types composing these tissues and
cell types derived from these tissues).

The nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in various diseases and disorders described below and/or other
pathologies and disorders. For example, but not limited to, a cDNA encoding the human
trypsin inhibitor-like protein may be useful in gene therapy, and the human trypsin
inhibitor-like protein may be useful when administered to a subject in need thereof. By way of
non-limiting example, the compositions of the present invention will have efficacy for treatment of
patients suffering from, for example, but not limited to, allergies and infectious diseases,
cancer, metabolic disorders like obesity, hypertension and diabetes, and other diseases and
disorders. The novel nucleic acid encoding the human trypsin inhibitor-like protein, and the
human trypsin inhibitor-like protein of the invention, or fragments thereof, may further be
useful in diagnostic applications, wherein the presence or amount of the nucleic acid or the
protein are to be assessed. These materials are further useful in the generation of antibodies
that bind immunospecifically to the novel substances of the invention for use in therapeutic or
diagnostic methods.



NOV-X Nucleic acids 

The nucleic acids of the invention include those that encode a NOV-X polypeptide or
protein. As used herein, the terms polypeptide and protein are interchangeable.

In some embodiments, a NOV-X nucleic acid encodes a mature NOV-X polypeptide.
As used herein, a "mature" form of a polypeptide or protein described herein relates to the
product of a naturally occurring polypeptide or precursor form or proprotein. The naturally
occurring polypeptide, precursor or proprotein includes, by way of nonlimiting example, the
full-length gene product, encoded by the corresponding gene. Alternatively, it may be defined
as the polypeptide, precursor or proprotein encoded by an open reading frame described
herein. The product "mature" form arises, again by way of nonlimiting example, as a result of
one or more naturally occurring processing steps that may take place within the cell in which
the gene product arises. Examples of such processing steps leading to a "mature" form of a
polypeptide or protein include the cleavage of the N-terminal methionine residue encoded by
the initiation codon of an open reading frame, or the proteolytic cleavage of a signal peptide or
leader sequence. Thus a mature form arising from a precursor polypeptide or protein that has
residues 1 to N, where residue 1 is the N-terminal methionine, would have residues 2 through
N remaining after removal of the N-terminal methionine. Alternatively, a mature form arising
from a precursor polypeptide or protein having residues 1 to N, in which an N-terminal signal
sequence from residue 1 to residue M is cleaved, would have the residues from residue M+1 to
residue N remaining. Further as used herein, a "mature" form of a polypeptide or protein may
arise from a step of post-translational modification other than a proteolytic cleavage event.
Such additional processes include, by way of non-limiting example, glycosylation,
myristoylation or phosphorylation. In general, a mature polypeptide or protein may result
from the operation of only one of these processes, or a combination of any of them.
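
The residue numbering used in this definition can be illustrated with two small helpers. This is a sketch only: the function names are hypothetical, and the example uses the first 29 residues of the disclosed NOV-4e polypeptide together with the signal peptide cleavage site predicted above (between positions 22 and 23).

```python
def remove_initiator_met(precursor: str) -> str:
    """Return residues 2..N of a precursor whose residue 1 is methionine."""
    if not precursor.startswith("M"):
        raise ValueError("precursor does not start with methionine")
    return precursor[1:]


def cleave_signal_peptide(precursor: str, m: int) -> str:
    """Return residues M+1..N after removal of a signal sequence spanning 1..M."""
    if not 0 < m < len(precursor):
        raise ValueError("cleavage position must lie inside the precursor")
    return precursor[m:]


# First 29 residues of the disclosed NOV-4e polypeptide (SEQ ID NO: 23).
n_terminus = "MSCVLGGVIPLGLLFLVRGSQGYLLPNVT"

# Predicted cleavage between positions 22 and 23 (SQG-YL), so M = 22.
mature_start = cleave_signal_peptide(n_terminus, 22)
print(n_terminus[19:22], mature_start)
```

Python strings are 0-based, so slicing at index M keeps residue M+1 onward, which matches the 1-based numbering in the text.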
Among the NOV-X nucleic acids is the nucleic acid whose sequence is provided in
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a fragment thereof. Additionally, 
the invention includes mutant or variant nucleic acids of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 
16, 18, 20, 22, or 57, or a fragment thereof, any of whose bases may be changed from the
corresponding bases shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, while 
still encoding a protein that maintains at least one of its NOV-X-like activities and 
physiological functions (i.e., modulating angiogenesis, neuronal development). The invention 
further includes the complement of the nucleic acid sequence of SEQ ID NO: 1, 3, 6, 8, 10, 
12, 14, 16, 18, 20, 22, or 57, including fragments, derivatives, analogs and homologs thereof. 
The invention additionally includes nucleic acids or nucleic acid fragments, or complements 
thereto, whose structures include chemical modifications. 

One aspect of the invention pertains to isolated nucleic acid molecules that encode
NOV-X proteins or biologically active portions thereof. Also included are nucleic acid
fragments sufficient for use as hybridization probes to identify NOV-X-encoding nucleic acids
(e.g., NOV-X mRNA) and fragments for use as polymerase chain reaction (PCR) primers for
the amplification or mutation of NOV-X nucleic acid molecules. As used herein, the term
"nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic
DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA generated using
nucleotide analogs, and derivatives, fragments and homologs thereof. The nucleic acid
molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

"Probes" refer to nucleic acid sequences of variable length, preferably between at least
about 10 nucleotides (nt), 100 nt, or as many as about, e.g., 6,000 nt, depending on use.
Probes are used in the detection of identical, similar, or complementary nucleic acid
sequences. Longer length probes are usually obtained from a natural or recombinant source,
are highly specific and much slower to hybridize than oligomers. Probes may be single- or
double-stranded and designed to have specificity in PCR, membrane-based hybridization
technologies, or ELISA-like technologies.

An "isolated" nucleic acid molecule is one that is separated from other nucleic acid
molecules that are present in the natural source of the nucleic acid. Examples of isolated
nucleic acid molecules include, but are not limited to, recombinant DNA molecules contained
in a vector, recombinant DNA molecules maintained in a heterologous host cell, partially or
substantially purified nucleic acid molecules, and synthetic DNA or RNA molecules.
Preferably, an "isolated" nucleic acid is free of sequences which naturally flank the nucleic
acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of
the organism from which the nucleic acid is derived. For example, in various embodiments,
the isolated NOV-X nucleic acid molecule can contain less than about 50 kb, 25 kb, 5 kb, 4
kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic
acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover,
an "isolated" nucleic acid molecule, such as a cDNA molecule, can be substantially free of
other cellular material or culture medium when produced by recombinant techniques, or of
chemical precursors or other chemicals when chemically synthesized.

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule having
the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a
complement of any of these nucleotide sequences, can be isolated using standard molecular
biology techniques and the sequence information provided herein. Using all or a portion of
the nucleic acid sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, as a
hybridization probe, NOV-X nucleic acid sequences can be isolated using standard
hybridization and cloning techniques (e.g., as described in Sambrook et al., eds., Molecular
Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, 1989; and Ausubel, et al., eds., Current Protocols in Molecular
Biology, John Wiley & Sons, New York, NY, 1993).

A nucleic acid of the invention can be amplified using cDNA, mRNA or, alternatively,
genomic DNA as a template and appropriate oligonucleotide primers according to standard
PCR amplification techniques. The nucleic acid so amplified can be cloned into an
appropriate vector and characterized by DNA sequence analysis. Furthermore,
oligonucleotides corresponding to NOV-X nucleotide sequences can be prepared by standard
synthetic techniques, e.g., using an automated DNA synthesizer.

As used herein, the term "oligonucleotide" refers to a series of linked nucleotide
residues, which oligonucleotide has a sufficient number of nucleotide bases to be used in a
PCR reaction. A short oligonucleotide sequence may be based on, or designed from, a
genomic or cDNA sequence and is used to amplify, confirm, or reveal the presence of an
identical, similar or complementary DNA or RNA in a particular cell or tissue.
Oligonucleotides comprise portions of a nucleic acid sequence having about 10 nt, 50 nt, or
100 nt in length, preferably about 15 nt to 30 nt in length. In one embodiment, an
oligonucleotide comprising a nucleic acid molecule less than 100 nt in length would further
comprise at least 6 contiguous nucleotides of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20,
22, or 57, or a complement thereof. Oligonucleotides may be chemically synthesized and may
be used as probes.
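
Whether an oligonucleotide contains at least 6 contiguous nucleotides of a reference sequence can be checked by comparing 6-mers. The following is a minimal sketch: the function name `shares_contiguous_run` and the two test oligonucleotides are hypothetical, and the reference is the first 32 nucleotides of SEQ ID NO: 22 as printed in Table 33. To test against the complement, the complementary strand would be passed as the reference.

```python
def shares_contiguous_run(oligo: str, reference: str, k: int = 6) -> bool:
    """Return True if oligo and reference share a run of k contiguous nucleotides."""
    oligo, reference = oligo.upper(), reference.upper()
    # Collect every k-mer of the reference once, then probe with the oligo's k-mers.
    reference_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(oligo[i:i + k] in reference_kmers for i in range(len(oligo) - k + 1))


# First 32 nucleotides of SEQ ID NO: 22 (Table 33).
reference = "CTCTGACTGCTCCTATTGAGCTGTCTGCTCGC"

has_run = shares_contiguous_run("AAAATTGAGCTAAAA", reference)  # shares ATTGAG
no_run = shares_contiguous_run("AAAAAAAAAA", reference)        # shares no 6-mer
print(has_run, no_run)
```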
In another embodiment, an isolated nucleic acid molecule of the invention comprises a
nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID NO:
1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a portion of this nucleotide sequence. A nucleic
acid molecule that is complementary to the nucleotide sequence shown in SEQ ID NO: 1, 3,
6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 is one that is sufficiently complementary to the
nucleotide sequence shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 that it
can hydrogen bond with few or no mismatches to the nucleotide sequence shown in SEQ ID
NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, thereby forming a stable duplex.
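
Complementarity with "few or no mismatches" can be made concrete by comparing a candidate strand with the perfect Watson-Crick reverse complement. The sketch below is illustrative: the function names are hypothetical, only Watson-Crick pairing of equal-length DNA strands is considered, and the example strand is the first 15 nucleotides of the NOV-4e ORF (nucleotides 206-220 of SEQ ID NO: 22).

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def duplex_mismatches(strand: str, candidate: str) -> int:
    """Count mismatched positions when candidate pairs antiparallel to strand."""
    if len(strand) != len(candidate):
        raise ValueError("strands must have equal length in this sketch")
    perfect = reverse_complement(strand).upper()
    return sum(a != b for a, b in zip(perfect, candidate.upper()))


# Nucleotides 206-220 of SEQ ID NO: 22 (start of the NOV-4e ORF).
strand = "ATGAGCTGCGTCCTG"

perfect_partner = reverse_complement(strand)
print(perfect_partner, duplex_mismatches(strand, perfect_partner))
```

A threshold on the mismatch count (or a melting-temperature estimate) would be needed to decide whether a given duplex is "stable"; the text does not fix one, so none is assumed here.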

As used herein, the term "complementary" refers to Watson-Crick or Hoogsteen base
pairing between nucleotide units of a nucleic acid molecule, and the term "binding" means the
physical or chemical interaction between two polypeptides or compounds or associated
polypeptides or compounds or combinations thereof. Binding includes ionic, non-ionic, van
der Waals, hydrophobic interactions, etc. A physical interaction can be either direct or
indirect. Indirect interactions may be through or due to the effects of another polypeptide or
compound. Direct binding refers to interactions that do not take place through, or due to, the
effect of another polypeptide or compound, but instead are without other substantial chemical
intermediates.

Moreover, the nucleic acid molecule of the invention can comprise only a portion of
the nucleic acid sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, e.g., a
fragment that can be used as a probe or primer, or a fragment encoding a biologically active
portion of NOV-X. Fragments provided herein are defined as sequences of at least 6
(contiguous) nucleic acids or at least 4 (contiguous) amino acids, a length sufficient to allow
for specific hybridization in the case of nucleic acids or for specific recognition of an epitope
in the case of amino acids, respectively, and are at most some portion less than a full length
sequence. Fragments may be derived from any contiguous portion of a nucleic acid or amino
acid sequence of choice. Derivatives are nucleic acid sequences or amino acid sequences
formed from the native compounds either directly or by modification or partial substitution.
Analogs are nucleic acid sequences or amino acid sequences that have a structure similar to,
but not identical to, the native compound but differ from it in respect to certain components
or side chains. Analogs may be synthetic or from a different evolutionary origin and may have
a similar or opposite metabolic activity compared to wild type.

Derivatives and analogs may be full length or other than full length, if the derivative or
analog contains a modified nucleic acid or amino acid, as described below. Derivatives or
analogs of the nucleic acids or proteins of the invention include, but are not limited to,
molecules comprising regions that are substantially homologous to the nucleic acids or
proteins of the invention, in various embodiments, by at least about 70%, 80%, 85%, 90%,
95%, 98%, or even 99% identity (with a preferred identity of 80-99%) over a nucleic acid or
amino acid sequence of identical size or when compared to an aligned sequence in which the
alignment is done by a computer homology program known in the art, or whose encoding
nucleic acid is capable of hybridizing to the complement of a sequence encoding the
aforementioned proteins under stringent, moderately stringent, or low stringent conditions.
See e.g. Ausubel, et al., Current Protocols in Molecular Biology, John Wiley & Sons,
New York, NY, 1993, and below. An exemplary program is the Gap program (Wisconsin
Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University
Research Park, Madison, WI) using the default settings, which uses the algorithm of Smith and
Waterman (Adv. Appl. Math., 1981, 2: 482-489, which is incorporated herein by reference in
its entirety).
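
For orientation, the local alignment recurrence of Smith and Waterman can be sketched in a few lines. This is not the Wisconsin package implementation: the scoring values (match 2, mismatch -1, linear gap -2) are arbitrary assumptions chosen for illustration, whereas production programs use substitution matrices and their own gap penalties. The example compares short fragments of the NOV-4d and NOV-4e polypeptides that differ at one residue.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score using a linear gap penalty."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


# Fragments around the S/G difference between NOV-4d and NOV-4e.
score = smith_waterman_score("HGPTSLLV", "HGPTGLLV")
print(score)
```

With these illustrative parameters, seven matches and one mismatch give a score of 13; percent identity over the aligned region would then be computed from the traceback, which this sketch omits.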

A "homologous nucleic acid sequence" or "homologous amino acid sequence," or
variations thereof, refer to sequences characterized by a homology at the nucleotide level or
amino acid level as discussed above. Homologous nucleotide sequences encode those
sequences coding for isoforms of a NOV-X polypeptide. Isoforms can be expressed in
different tissues of the same organism as a result of, for example, alternative splicing of RNA.
Alternatively, isoforms can be encoded by different genes. In the present invention,
homologous nucleotide sequences include nucleotide sequences encoding for a NOV-X
polypeptide of species other than humans, including, but not limited to, mammals, and thus
can include, e.g., mouse, rat, rabbit, dog, cat, cow, horse, and other organisms. Homologous
nucleotide sequences also include, but are not limited to, naturally occurring allelic variations
and mutations of the nucleotide sequences set forth herein. A homologous nucleotide
sequence does not, however, include the nucleotide sequence encoding human NOV-X
protein. Homologous nucleic acid sequences include those nucleic acid sequences that encode
conservative amino acid substitutions (see below) in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17,
19, 21, or 23, as well as a polypeptide having NOV-X activity. Biological activities of the
NOV-X proteins are described below. A homologous amino acid sequence does not encode
the amino acid sequence of a human NOV-X polypeptide.

The nucleotide sequence determined from the cloning of the human NOV-X gene
allows for the generation of probes and primers designed for use in identifying and/or cloning
NOV-X homologues in other cell types, e.g., from other tissues, as well as NOV-X
homologues from other mammals. The probe/primer typically comprises a substantially
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide
sequence that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 200,
250, 300, 350 or 400 or more consecutive sense strand nucleotide sequence of SEQ ID NO: 1,
3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57; or an anti-sense strand nucleotide sequence of SEQ
ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57; or of a naturally occurring mutant of SEQ
ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57.

Probes based on the human NOV-X nucleotide sequence can be used to detect
transcripts or genomic sequences encoding the same or homologous proteins. In various
embodiments, the probe further comprises a label group attached thereto, e.g., the label group
can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Such
probes can be used as a part of a diagnostic test kit for identifying cells or tissue which
misexpress a NOV-X protein, such as by measuring a level of a NOV-X-encoding nucleic acid
in a sample of cells from a subject, e.g., detecting NOV-X mRNA levels or determining
whether a genomic NOV-X gene has been mutated or deleted.

A "polypeptide having a biologically active portion of NOV-X" refers to polypeptides
exhibiting activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological assay, with
or without dose dependency. A nucleic acid fragment encoding a "biologically active portion
of NOV-X" can be prepared by isolating a portion of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16,
18, 20, 22, or 57 that encodes a polypeptide having a NOV-X biological activity (biological
activities of the NOV-X proteins are described below), expressing the encoded portion of
NOV-X protein (e.g., by recombinant expression in vitro) and assessing the activity of the
encoded portion of NOV-X. For example, a nucleic acid fragment encoding a biologically
active portion of NOV-X can optionally include an ATP-binding domain. In another
embodiment, a nucleic acid fragment encoding a biologically active portion of NOV-X
includes one or more regions.

NOV-X Variants 

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequences shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 due to
the degeneracy of the genetic code. These nucleic acids thus encode the same NOV-X protein
as that encoded by the nucleotide sequence shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18,
20, 22, or 57, e.g., the polypeptide of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. In
another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide
sequence encoding a protein having an amino acid sequence shown in SEQ ID NO: 2, 4, 5, 7,
9, 11, 13, 15, 17, 19, 21, or 23.
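
By way of illustration only, the degeneracy described above can be sketched in Python. The sketch builds the standard genetic code table and shows two different nucleotide sequences that encode the same peptide. The function name translate and the short sequences seq_a and seq_b are hypothetical and are not taken from any SEQ ID NO disclosed herein.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding-strand DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical sequences: they differ at several positions but use
# synonymous codons, so they encode the same peptide.
seq_a = "ATGCTGAGCAAA"
seq_b = "ATGTTATCTAAG"

print(translate(seq_a), translate(seq_b))
```

Both calls return the same four-residue peptide although the two nucleotide sequences differ, which is the situation contemplated for nucleic acids that differ only by codon degeneracy.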

In addition to the human NOV-X nucleotide sequence shown in SEQ ID NO: 1, 3, 6, 8,
10, 12, 14, 16, 18, 20, 22, or 57, it will be appreciated by those skilled in the art that DNA
sequence polymorphisms that lead to changes in the amino acid sequences of NOV-X may
exist within a population (e.g., the human population). Such genetic polymorphism in the
NOV-X gene may exist among individuals within a population due to natural allelic variation.
As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules
comprising an open reading frame encoding a NOV-X protein, preferably a mammalian
NOV-X protein. Such natural allelic variations can typically result in 1-5% variance in the
nucleotide sequence of the NOV-X gene. Any and all such nucleotide variations and resulting
amino acid polymorphisms in NOV-X that are the result of natural allelic variation and that do
not alter the functional activity of NOV-X are intended to be within the scope of the invention.

Moreover, nucleic acid molecules encoding NOV-X proteins from other species, and
thus that have a nucleotide sequence that differs from the human sequence of SEQ ID NO: 1,
3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 are intended to be within the scope of the invention.
Nucleic acid molecules corresponding to natural allelic variants and homologues of the
NOV-X cDNAs of the invention can be isolated based on their homology to the human NOV-X
nucleic acids disclosed herein using the human cDNAs, or a portion thereof, as a hybridization
probe according to standard hybridization techniques under stringent hybridization conditions.
For example, a soluble human NOV-X cDNA can be isolated based on its homology to human
membrane-bound NOV-X. Likewise, a membrane-bound human NOV-X cDNA can be
isolated based on its homology to soluble human NOV-X.

Accordingly, in another embodiment, an isolated nucleic acid molecule of the
invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12,
14, 16, 18, 20, 22, or 57. In another embodiment, the nucleic acid is at least 10, 25, 50, 100,
250, 500 or 750 nucleotides in length. In another embodiment, an isolated nucleic acid
molecule of the invention hybridizes to the coding region. As used herein, the term
"hybridizes under stringent conditions" is intended to describe conditions for hybridization and
washing under which nucleotide sequences at least 60% homologous to each other typically
remain hybridized to each other.

Homologs (i.e., nucleic acids encoding NOV-X proteins derived from species other
than human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or
high stringency hybridization with all or a portion of the particular human sequence as a probe
using methods well known in the art for nucleic acid hybridization and cloning.

As used herein, the phrase "stringent hybridization conditions" refers to conditions
under which a probe, primer or oligonucleotide will hybridize to its target sequence, but to no
other sequences. Stringent conditions are sequence-dependent and will be different in
different circumstances. Longer sequences hybridize specifically at higher temperatures than
shorter sequences. Generally, stringent conditions are selected to be about 5°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The
Tm is the temperature (under defined ionic strength, pH and nucleic acid concentration) at
which 50% of the probes complementary to the target sequence hybridize to the target
sequence at equilibrium. Since the target sequences are generally present at excess, at Tm,
50% of the probes are occupied at equilibrium. Typically, stringent conditions will be those in
which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M
sodium ion (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short
probes, primers or oligonucleotides (e.g., 10 nt to 50 nt) and at least about 60°C for longer
probes, primers and oligonucleotides. Stringent conditions may also be achieved with the
addition of destabilizing agents, such as formamide.
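
As a non-limiting sketch of the relationship between Tm and a stringent temperature, the following Python fragment estimates Tm for a short oligonucleotide and subtracts the offset of about 5°C mentioned above. The Wallace rule used here (2°C per A or T, 4°C per G or C) is a common rule of thumb for short oligonucleotides and is an assumption of this sketch, not a method stated herein; it ignores ionic strength, pH and probe concentration, on which Tm depends. The function names and the example probe are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate in degrees C for a short DNA oligonucleotide (Wallace rule)."""
    seq = oligo.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("sequence may contain only A, C, G and T")
    return 2.0 * at_count + 4.0 * gc_count


def stringent_temperature(oligo, offset=5.0):
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(oligo) - offset


probe = "ATGCATGCATGCATGC"  # hypothetical 16-nt probe, 8 A/T and 8 G/C
print(wallace_tm(probe), stringent_temperature(probe))
```

For longer probes or for buffers containing formamide, an empirically determined Tm would be expected to differ substantially from this estimate.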

Stringent conditions are known to those skilled in the art and can be found in
Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
Preferably, the conditions are such that sequences at least about 65%, 70%, 75%, 85%, 90%,
95%, 98%, or 99% homologous to each other typically remain hybridized to each other.
A non-limiting example of stringent hybridization conditions is hybridization in a high salt
buffer comprising 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02%
Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm DNA at 65°C. This hybridization
is followed by one or more washes in 0.2X SSC, 0.01% BSA at 50°C. An isolated nucleic
acid molecule of the invention that hybridizes under stringent conditions to the sequence of
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 corresponds to a naturally occurring
nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to
an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a
natural protein).

In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic
acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16,
18, 20, 22, or 57, or fragments, analogs or derivatives thereof, under conditions of moderate
stringency is provided. A non-limiting example of moderate stringency hybridization
conditions is hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 mg/ml
denatured salmon sperm DNA at 55°C, followed by one or more washes in 1X SSC, 0.1%
SDS at 37°C. Other conditions of moderate stringency that may be used are well known in the
art. See, e.g., Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology,
John Wiley & Sons, NY, and Kriegler, 1990, Gene Transfer and Expression, A
Laboratory Manual, Stockton Press, NY.

In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid molecule
comprising the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or
57, or fragments, analogs or derivatives thereof, under conditions of low stringency, is
provided. A non-limiting example of low stringency hybridization conditions is
hybridization in 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02%
PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (wt/vol)
dextran sulfate at 40°C, followed by one or more washes in 2X SSC, 25 mM Tris-HCl (pH
7.4), 5 mM EDTA, and 0.1% SDS at 50°C. Other conditions of low stringency that may be
used are well known in the art (e.g., as employed for cross-species hybridizations). See, e.g.,
Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology, John Wiley &
Sons, NY, and Kriegler, 1990, Gene Transfer and Expression, A Laboratory Manual,
Stockton Press, NY; Shilo and Weinberg, 1981, Proc Natl Acad Sci USA 78: 6789-6792.

Conservative mutations 

In addition to naturally-occurring allelic variants of the NOV-X sequence that may
exist in the population, the skilled artisan will further appreciate that changes can be
introduced by mutation into the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16,
18, 20, 22, or 57, thereby leading to changes in the amino acid sequence of the encoded
NOV-X protein, without altering the functional ability of the NOV-X protein. For example,
nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid
residues can be made in the sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or
57. A "non-essential" amino acid residue is a residue that can be altered from the wild-type
sequence of NOV-X without altering the biological activity, whereas an "essential" amino acid
residue is required for biological activity. For example, amino acid residues that are
conserved among the NOV-X proteins of the present invention are predicted to be particularly
unamenable to alteration.

Another aspect of the invention pertains to nucleic acid molecules encoding NOV-X
proteins that contain changes in amino acid residues that are not essential for activity. Such
NOV-X proteins differ in amino acid sequence from SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17,
19, 21, or 23, yet retain biological activity. In one embodiment, the isolated nucleic acid
molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises
an amino acid sequence at least about 75% homologous to the amino acid sequence of SEQ ID
NO: 2, 4, 6, or 8. Preferably, the protein encoded by the nucleic acid is at least about 80%
homologous to SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, more preferably at least
about 90%, 95%, 98%, and most preferably at least about 99% homologous to SEQ ID NO: 2,
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23.

An isolated nucleic acid molecule encoding a NOV-X protein homologous to the
protein of can be created by introducing one or more nucleotide substitutions, additions or
deletions into the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 
57, such that one or more amino acid substitutions, additions or deletions are introduced into 
the encoded protein. 

Mutations can be introduced into the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8,
10, 12, 14, 16, 18, 20, 22, or 57 by standard techniques, such as site-directed mutagenesis and
PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at
one or more predicted non-essential amino acid residues. A "conservative amino acid
substitution" is one in which the amino acid residue is replaced with an amino acid residue
having a similar side chain. Families of amino acid residues having similar side chains have
been defined in the art. These families include amino acids with basic side chains (e.g., lysine,
arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side
chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar
side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine,
tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side
chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential
amino acid residue in NOV-X is replaced with another amino acid residue from the same side
chain family. Alternatively, in another embodiment, mutations can be introduced randomly
along all or part of a NOV-X coding sequence, such as by saturation mutagenesis, and the
resultant mutants can be screened for NOV-X biological activity to identify mutants that retain
activity. Following mutagenesis of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57,
the encoded protein can be expressed by any recombinant technology known in the art and the
activity of the protein can be determined.
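
The side chain families listed above lend themselves to a simple lookup. The following Python sketch, offered only as an illustration, encodes those families with one-letter residue codes and reports whether a proposed substitution stays within at least one family. The names SIDE_CHAIN_FAMILIES and is_conservative are hypothetical, and treating "shares at least one family" as the test for a conservative substitution is a simplifying assumption of this sketch.

```python
# Side chain families as listed in the text, using one-letter residue codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),        # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # Tyr, Phe, Trp, His
}


def shared_families(original, replacement):
    """Return the names of the families that contain both residues."""
    a, b = original.upper(), replacement.upper()
    return [name for name, members in SIDE_CHAIN_FAMILIES.items()
            if a in members and b in members]


def is_conservative(original, replacement):
    """True if the two residues differ and share at least one family."""
    return original.upper() != replacement.upper() and bool(
        shared_families(original, replacement))


print(is_conservative("K", "R"), is_conservative("K", "D"))
```

Because some residues belong to more than one family (e.g., threonine is both uncharged polar and beta-branched), a substitution such as threonine to valine is reported as conservative even though the two residues differ in polarity.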

In one embodiment, a mutant NOV-X protein can be assayed for (1) the ability to form
protein:protein interactions with other NOV-X proteins, other cell-surface proteins, or
biologically active portions thereof, (2) complex formation between a mutant NOV-X protein
and a NOV-X receptor, (3) the ability of a mutant NOV-X protein to bind to an intracellular
target protein or biologically active portion thereof (e.g., avidin proteins); (4) the ability to
bind NOV-X protein; or (5) the ability to specifically bind an anti-NOV-X protein antibody.

Antisense NOV-X Nucleic acids 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules
that are hybridizable to or complementary to the nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or fragments,
analogs or derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence
that is complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence.
In specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire NOV-X
coding strand, or to only a portion thereof. Nucleic acid molecules encoding fragments,
homologs, derivatives and analogs of a NOV-X protein of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13,
15, 17, 19, 21, or 23 or antisense nucleic acids complementary to a NOV-X nucleic acid
sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence encoding NOV-X. The term "coding
region" refers to the region of the nucleotide sequence comprising codons which are translated
into amino acid residues (e.g., the protein coding region of human NOV-X corresponds to
SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23). In another embodiment, the antisense
nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide
sequence encoding NOV-X. The term "noncoding region" refers to 5' and 3' sequences which
flank the coding region that are not translated into amino acids (i.e., also referred to as 5' and
3' untranslated regions).

Given the coding strand sequences encoding NOV-X disclosed herein (e.g., SEQ ID
NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of NOV-X mRNA,
but more preferably is an oligonucleotide that is antisense to only a portion of the coding or
noncoding region of NOV-X mRNA. For example, the antisense oligonucleotide can be
complementary to the region surrounding the translation start site of NOV-X mRNA. An
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length. An antisense nucleic acid of the invention can be constructed using
chemical synthesis or enzymatic ligation reactions using procedures known in the art. For
example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically
synthesized using naturally occurring nucleotides or variously modified nucleotides designed
to increase the biological stability of the molecules or to increase the physical stability of the
duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and acridine substituted nucleotides can be used.
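
Designing an antisense oligonucleotide by Watson and Crick base pairing, as described above, amounts to taking the reverse complement of a window of the coding strand. The Python sketch below illustrates this for a window centered on the translation start site. The function names, the centering choice, and the short example sequence are hypothetical and are not taken from any SEQ ID NO disclosed herein; Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Watson-Crick complements for DNA; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start_codon_index, length=20):
    """Antisense oligonucleotide for a window around the start codon.

    The window begins length // 2 nucleotides upstream of the start codon
    (or at position 0 if the sequence is shorter) and spans `length`
    nucleotides of the coding strand.
    """
    begin = max(0, start_codon_index - length // 2)
    window = coding_strand[begin:begin + length]
    return reverse_complement(window)


# Hypothetical coding strand with the ATG start codon at index 6.
coding = "GGGCCCATGAAATTT"
print(antisense_oligo(coding, start_codon_index=6, length=6))
```

The returned oligonucleotide is written 5' to 3' and pairs antiparallel with the selected window of the coding strand (or the corresponding mRNA).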

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a NOV-X protein to thereby inhibit expression of the protein, e.g., by
inhibiting transcription and/or translation. The hybridization can be by conventional
nucleotide complementarity to form a stable duplex, or, for example, in the case of an
antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface,
e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to
cells using the vectors described herein. To achieve sufficient intracellular concentrations of
antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed
under the control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641).
The antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et
al. (1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al.
(1987) FEBS Lett 215: 327-330).

Such modifications include, by way of nonlimiting example, modified bases, and
nucleic acids whose sugar phosphate backbones are modified or derivatized. These
modifications are carried out at least in part to enhance the chemical stability of the modified
nucleic acid, such that they may be used, for example, as antisense binding nucleic acids in
therapeutic implications in a subject.

NOV-X Ribozymes and PNA moieties

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of
cleaving a single-stranded nucleic acid, such as a mRNA, to which they have a complementary
region. Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach
(1988) Nature 334:585-591)) can be used to catalytically cleave NOV-X mRNA transcripts to
thereby inhibit translation of NOV-X mRNA. A ribozyme having specificity for a
NOV-X-encoding nucleic acid can be designed based upon the nucleotide sequence of a NOV-X
DNA disclosed herein (i.e., SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57). For
example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the
nucleotide sequence of the active site is complementary to the nucleotide sequence to be
cleaved in a NOV-X-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and
Cech et al. U.S. Pat. No. 5,116,742. Alternatively, NOV-X mRNA can be used to select a
catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See,
e.g., Bartel et al., (1993) Science 261:1411-1418.

Alternatively, NOV-X gene expression can be inhibited by targeting nucleotide
sequences complementary to the regulatory region of the NOV-X (e.g., the NOV-X promoter
and/or enhancers) to form triple helical structures that prevent transcription of the NOV-X
gene in target cells. See generally, Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et
al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of NOV-X can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic
acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by
a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of NOV-X can be used in therapeutic and diagnostic applications. For example,
PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene
expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs
of NOV-X can also be used, e.g., in the analysis of single base pair mutations in a gene by,
e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in combination
with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or primers for
DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).

In another embodiment, PNAs of NOV-X can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras of NOV-X can be generated that
may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion
while the PNA portion would provide high binding affinity and specificity. PNA-DNA
chimeras can be linked using linkers of appropriate lengths selected in terms of base stacking,
number of bonds between the nucleobases, and orientation (Hyrup (1996) above). The
synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996) above and
Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain can be synthesized
on a solid support using standard phosphoramidite coupling chemistry, and modified
nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can
be used between the PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17:
5973-88). PNA monomers are then coupled in a stepwise manner to produce a chimeric
molecule with a 5' PNA segment and a 3' DNA segment (Finn et al. (1996) above).
Alternatively, chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA
segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across
the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No.
WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In
addition, oligonucleotides can be modified with hybridization triggered cleavage agents (see,
e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988,
Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to another
molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport agent, a
hybridization-triggered cleavage agent, etc.

NOV-X Polypeptides 

A NOV-X polypeptide of the invention includes the NOV-X-like protein whose
sequence is provided in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. The invention
also includes a mutant or variant protein any of whose residues may be changed from the
corresponding residue shown in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 while
still encoding a protein that maintains its NOV-X-like activities and physiological functions,
or a functional fragment thereof. In some embodiments, up to 20% or more of the residues
may be so changed in the mutant or variant protein. In some embodiments, the NOV-X
polypeptide according to the invention is a mature polypeptide.

In general, a NOV-X-like variant that preserves NOV-X-like function includes any
variant in which residues at a particular position in the sequence have been substituted by
other amino acids, and further includes the possibility of inserting an additional residue or
residues between two residues of the parent protein as well as the possibility of deleting one or
more residues from the parent sequence. Any amino acid substitution, insertion, or deletion is
encompassed by the invention. In favorable circumstances, the substitution is a conservative
substitution as defined above.

One aspect of the invention pertains to isolated NOV-X proteins, and biologically
active portions thereof, or derivatives, fragments, analogs or homologs thereof. Also provided
are polypeptide fragments suitable for use as immunogens to raise anti-NOV-X antibodies. In
one embodiment, native NOV-X proteins can be isolated from cells or tissue sources by an
appropriate purification scheme using standard protein purification techniques. In another
embodiment, NOV-X proteins are produced by recombinant DNA techniques. Alternative to
recombinant expression, a NOV-X protein or polypeptide can be synthesized chemically using
standard peptide synthesis techniques.

An "isolated" or "purified" protein or biologically active portion thereof is substantially
free of cellular material or other contaminating proteins from the cell or tissue source from
which the NOV-X protein is derived, or substantially free from chemical precursors or other
chemicals when chemically synthesized. The language "substantially free of cellular material"
includes preparations of NOV-X protein in which the protein is separated from cellular
components of the cells from which it is isolated or recombinantly produced.

In one embodiment, the language "substantially free of cellular material" includes
preparations of NOV-X protein having less than about 30% (by dry weight) of non-NOV-X
protein (also referred to herein as a "contaminating protein"), more preferably less than about
20% of non-NOV-X protein, still more preferably less than about 10% of non-NOV-X protein,
and most preferably less than about 5% non-NOV-X protein. When the NOV-X protein or
biologically active portion thereof is recombinantly produced, it is also preferably
substantially free of culture medium, i.e., culture medium represents less than about 20%,
more preferably less than about 10%, and most preferably less than about 5% of the volume of
the protein preparation.

The language "substantially free of chemical precursors or other chemicals" includes
preparations of NOV-X protein in which the protein is separated from chemical precursors or
other chemicals that are involved in the synthesis of the protein. In one embodiment, the
language "substantially free of chemical precursors or other chemicals" includes preparations
of NOV-X protein having less than about 30% (by dry weight) of chemical precursors or
non-NOV-X chemicals, more preferably less than about 20% chemical precursors or
non-NOV-X chemicals, still more preferably less than about 10% chemical precursors or
non-NOV-X chemicals, and most preferably less than about 5% chemical precursors or
non-NOV-X chemicals.

Biologically active portions of a NOV-X protein include peptides comprising amino
acid sequences sufficiently homologous to or derived from the amino acid sequence of the
NOV-X protein, e.g., the amino acid sequence shown in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15,
17, 19, 21, or 23, that include fewer amino acids than the full length NOV-X proteins, and
exhibit at least one activity of a NOV-X protein. Typically, biologically active portions
comprise a domain or motif with at least one activity of the NOV-X protein. A biologically
active portion of a NOV-X protein can be a polypeptide which is, for example, 10, 25, 50, 100
or more amino acids in length.

A biologically active portion of a NOV-X protein of the present invention may contain 
at least one of the above-identified domains conserved between the NOV-X proteins, e.g., 
TSR modules. Moreover, other biologically active portions, in which other regions of the 
protein are deleted, can be prepared by recombinant techniques and evaluated for one or more 
of the functional activities of a native NOV-X protein. 

In an embodiment, the NOV-X protein has an amino acid sequence shown in SEQ ID 
NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. In other embodiments, the NOV-X protein is 
substantially homologous to SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 and retains 
the functional activity of the protein of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 
yet differs in amino acid sequence due to natural allelic variation or mutagenesis, as described 
in detail below. Accordingly, in another embodiment, the NOV-X protein is a protein that 
comprises an amino acid sequence at least about 45% homologous to the amino acid sequence 
of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 and retains the functional activity of 
the NOV-X proteins of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. 

Determining homology between two or more sequences 

To determine the percent homology of two amino acid sequences or of two nucleic 
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced 
in either of the sequences being compared for optimal alignment between the sequences). The 
amino acid residues or nucleotides at corresponding amino acid positions or nucleotide 
positions are then compared. When a position in the first sequence is occupied by the same 
amino acid residue or nucleotide as the corresponding position in the second sequence, then 
the molecules are homologous at that position (i.e., as used herein amino acid or nucleic acid 
"homology" is equivalent to amino acid or nucleic acid "identity"). 

The nucleic acid sequence homology may be determined as the degree of identity 
between two sequences. The homology may be determined using computer programs known 
in the art, such as GAP software provided in the GCG program package. See, Needleman and 
Wunsch 1970 J Mol Biol 48: 443-453. Using GCG GAP software with the following settings 
for nucleic acid sequence comparison: GAP creation penalty of 5.0 and GAP extension 
penalty of 0.3, the coding region of the analogous nucleic acid sequences referred to above 
exhibits a degree of identity preferably of at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 
99%, with the CDS (encoding) part of the DNA sequence shown in SEQ ID NO: 1, 3, 6, 8, 
10, 12, 14, 16, 18, 20, 22, or 57. 

The term "sequence identity" refers to the degree to which two polynucleotide or 
polypeptide sequences are identical on a residue-by-residue basis over a particular region of 
comparison. The term "percentage of sequence identity" is calculated by comparing two 
optimally aligned sequences over that region of comparison, determining the number of 
positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of 
nucleic acids) occurs in both sequences to yield the number of matched positions, dividing the 
number of matched positions by the total number of positions in the region of comparison (i.e., 
the window size), and multiplying the result by 100 to yield the percentage of sequence 
identity. The term "substantial identity" as used herein denotes a characteristic of a 
polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 80 
percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent 
sequence identity, more usually at least 99 percent sequence identity as compared to a 
reference sequence over a comparison region. The term "percentage of positive residues" is 
calculated by comparing two optimally aligned sequences over that region of comparison, 
determining the number of positions at which the identical and conservative amino acid 
substitutions, as defined above, occur in both sequences to yield the number of matched 
positions, dividing the number of matched positions by the total number of positions in the 
region of comparison (i.e., the window size), and multiplying the result by 100 to yield the 
percentage of positive residues. 
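
The two percentages defined above can be sketched in a few lines of Python. This is a minimal illustration, not the GCG GAP program: it assumes the two sequences have already been optimally aligned and padded with "-" gap characters, it counts gap columns in the window size, and the conservative-substitution groups it uses are placeholder assumptions rather than the definition given elsewhere in this specification. The function and variable names are hypothetical.

```python
# Placeholder conservative-substitution groups (an assumption for this sketch;
# substitute the groups actually defined for the analysis at hand).
EXAMPLE_CONSERVATIVE_GROUPS = ("AVLIM", "FYW", "KRH", "DE", "ST", "NQ")


def _check_region(aligned_a, aligned_b):
    """Both inputs must cover the same, non-empty region of comparison."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("region of comparison is empty")


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Matched positions / window size * 100, with gaps never counted as matches."""
    _check_region(aligned_a, aligned_b)
    matched = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matched / len(aligned_a)


def percent_positives(aligned_a, aligned_b,
                      groups=EXAMPLE_CONSERVATIVE_GROUPS, gap="-"):
    """Identical plus conservatively substituted positions / window size * 100."""
    _check_region(aligned_a, aligned_b)
    group_of = {res: i for i, members in enumerate(groups) for res in members}
    matched = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if gap in (x, y):
            continue  # a gap column is neither identical nor conservative
        same_group = x in group_of and group_of.get(x) == group_of.get(y)
        if x == y or same_group:
            matched += 1
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ACGT-A", "ACCTGA"))
    print(percent_positives("KLDE", "RLEE"))
```

Whether gap columns belong in the denominator depends on how the region of comparison is chosen; the sketch takes the window size literally as the number of aligned columns.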

Chimeric and fusion proteins 

The invention also provides NOV-X chimeric or fusion proteins. As used herein, a 
NOV-X "chimeric protein" or "fusion protein" comprises a NOV-X polypeptide operatively 
linked to a non-NOV-X polypeptide. An "NOV-X polypeptide" refers to a polypeptide having 
an amino acid sequence corresponding to NOV-X, whereas a "non-NOV-X polypeptide" 
refers to a polypeptide having an amino acid sequence corresponding to a protein that is not 
substantially homologous to the NOV-X protein, e.g., a protein that is different from the 
NOV-X protein and that is derived from the same or a different organism. Within a NOV-X 
fusion protein the NOV-X polypeptide can correspond to all or a portion of a NOV-X protein. 
In one embodiment, a NOV-X fusion protein comprises at least one biologically active portion 
of a NOV-X protein. In another embodiment, a NOV-X fusion protein comprises at least two 
biologically active portions of a NOV-X protein. Within the fusion protein, the term 
"operatively linked" is intended to indicate that the NOV-X polypeptide and the non-NOV-X 
polypeptide are fused in-frame to each other. The non-NOV-X polypeptide can be fused to 
the N-terminus or C-terminus of the NOV-X polypeptide. 

For example, in one embodiment a NOV-X fusion protein comprises a NOV-X 
polypeptide operably linked to the extracellular domain of a second protein. Such fusion 
proteins can be further utilized in screening assays for compounds that modulate NOV-X 
activity (such assays are described in detail below). 

In another embodiment, the fusion protein is a GST-NOV-X fusion protein in which 
the NOV-X sequences are fused to the C-terminus of the GST (i.e., glutathione S-transferase) 
sequences. Such fusion proteins can facilitate the purification of recombinant NOV-X. 

In another embodiment, the fusion protein is a NOV-X-immunoglobulin fusion protein 
in which the NOV-X sequences comprising one or more domains are fused to sequences 
derived from a member of the immunoglobulin protein family. The NOV-X-immunoglobulin 
fusion proteins of the invention can be incorporated into pharmaceutical compositions and 
administered to a subject to inhibit an interaction between a NOV-X ligand and a NOV-X 
protein on the surface of a cell, to thereby suppress NOV-X-mediated signal transduction in 
vivo. In one nonlimiting example, a contemplated NOV-X ligand of the invention is the 
NOV-X receptor. The NOV-X-immunoglobulin fusion proteins can be used to affect the 
bioavailability of a NOV-X cognate ligand. Inhibition of the NOV-X ligand/NOV-X 
interaction may be useful therapeutically for both the treatment of proliferative and 
differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting) cell 
survival, as well as acute and chronic inflammatory disorders and hyperplastic wound healing, 
e.g., hypertrophic scars and keloids. Moreover, the NOV-X-immunoglobulin fusion proteins 
of the invention can be used as immunogens to produce anti-NOV-X antibodies in a subject, to 
purify NOV-X ligands, and in screening assays to identify molecules that inhibit the 
interaction of NOV-X with a NOV-X ligand. 

A NOV-X chimeric or fusion protein of the invention can be produced by standard 
recombinant DNA techniques. For example, DNA fragments coding for the different 
polypeptide sequences are ligated together in-frame in accordance with conventional 
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction 
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, 
alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In 
another embodiment, the fusion gene can be synthesized by conventional techniques including 
automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be 
carried out using anchor primers that give rise to complementary overhangs between two 
consecutive gene fragments that can subsequently be annealed and reamplified to generate a 
chimeric gene sequence (see, for example, Ausubel et al. (eds.) Current Protocols in 
Molecular Biology, John Wiley & Sons, 1992). Moreover, many expression vectors are 
commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 
NOV-X-encoding nucleic acid can be cloned into such an expression vector such that the 
fusion moiety is linked in-frame to the NOV-X protein. 



NOV-X agonists and antagonists 

The present invention also pertains to variants of the NOV-X proteins that function as 
either NOV-X agonists (mimetics) or as NOV-X antagonists. Variants of the NOV-X protein 
can be generated by mutagenesis, e.g., discrete point mutation or truncation of the NOV-X 
protein. An agonist of the NOV-X protein can retain substantially the same, or a subset of, the 
biological activities of the naturally occurring form of the NOV-X protein. An antagonist of 
the NOV-X protein can inhibit one or more of the activities of the naturally occurring form of 
the NOV-X protein by, for example, competitively binding to a downstream or upstream 
member of a cellular signaling cascade which includes the NOV-X protein. Thus, specific 
biological effects can be elicited by treatment with a variant of limited function. In one 
embodiment, treatment of a subject with a variant having a subset of the biological activities 
of the naturally occurring form of the protein has fewer side effects in a subject relative to 
treatment with the naturally occurring form of the NOV-X proteins. 

Variants of the NOV-X protein that function as either NOV-X agonists (mimetics) or 
as NOV-X antagonists can be identified by screening combinatorial libraries of mutants, e.g., 
truncation mutants, of the NOV-X protein for NOV-X protein agonist or antagonist activity. 
In one embodiment, a variegated library of NOV-X variants is generated by combinatorial 
mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A 
variegated library of NOV-X variants can be produced by, for example, enzymatically ligating 
a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of 
potential NOV-X sequences is expressible as individual polypeptides, or alternatively, as a set 
of larger fusion proteins (e.g., for phage display) containing the set of NOV-X sequences 
therein. There are a variety of methods which can be used to produce libraries of potential 
NOV-X variants from a degenerate oligonucleotide sequence. Chemical synthesis of a 
degenerate gene sequence can be performed in an automatic DNA synthesizer, and the 
synthetic gene then ligated into an appropriate expression vector. Use of a degenerate set of 
genes allows for the provision, in one mixture, of all of the sequences encoding the desired set 
of potential NOV-X sequences. Methods for synthesizing degenerate oligonucleotides are 
known in the art (see, e.g., Narang (1983) Tetrahedron 39:3; Itakura et al. (1984) Annu Rev 
Biochem 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucl Acid Res 
11:477). 
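
To make concrete what "all of the sequences" encoded by a degenerate oligonucleotide means, the following Python sketch enumerates the members of such a mixture in silico. It is only an illustration of library complexity, not a synthesis protocol. It assumes the degenerate positions are written with the standard IUPAC ambiguity codes; the function names and the example oligonucleotides are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _pools(oligo):
    """Return, per position, the bases allowed by the ambiguity code."""
    try:
        return [IUPAC_BASES[base] for base in oligo.upper()]
    except KeyError as exc:
        raise ValueError(f"not an IUPAC nucleotide code: {exc.args[0]!r}") from None


def library_size(oligo):
    """Number of distinct sequences present in the degenerate mixture."""
    size = 1
    for pool in _pools(oligo):
        size *= len(pool)
    return size


def expand_degenerate(oligo):
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    return ["".join(combo) for combo in product(*_pools(oligo))]


if __name__ == "__main__":
    # Hypothetical example: one fully randomized codon of the NNK type.
    print(library_size("NNK"))
    print(expand_degenerate("AAR"))
```

Checking `library_size` before calling `expand_degenerate` is advisable, since the number of members grows multiplicatively with each degenerate position.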

Polypeptide libraries 

In addition, libraries of fragments of the NOV-X protein coding sequence can be used 
to generate a variegated population of NOV-X fragments for screening and subsequent 
selection of variants of a NOV-X protein. In one embodiment, a library of coding sequence 
fragments can be generated by treating a double stranded PCR fragment of a NOV-X coding 
sequence with a nuclease under conditions wherein nicking occurs only about once per 
molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded 
DNA that can include sense/antisense pairs from different nicked products, removing single 
stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the 
resulting fragment library into an expression vector. By this method, an expression library can 
be derived which encodes N-terminal and internal fragments of various sizes of the NOV-X 
protein. 

Several techniques are known in the art for screening gene products of combinatorial 
libraries made by point mutations or truncation, and for screening cDNA libraries for gene 
products having a selected property. Such techniques are adaptable for rapid screening of the 
gene libraries generated by the combinatorial mutagenesis of NOV-X proteins. The most 
widely used techniques, which are amenable to high throughput analysis, for screening large 
gene libraries typically include cloning the gene library into replicable expression vectors, 
transforming appropriate cells with the resulting library of vectors, and expressing the 
combinatorial genes under conditions in which detection of a desired activity facilitates 
isolation of the vector encoding the gene whose product was detected. Recursive ensemble 
mutagenesis (REM), a new technique that enhances the frequency of functional mutants in the 
libraries, can be used in combination with the screening assays to identify NOV-X variants 
(Arkin and Youvan (1992) PNAS 89:7811-7815; Delagrave et al. (1993) Protein Engineering 
6:327-331). 

NOV-X Antibodies 

Also included in the invention are antibodies to NOV-X proteins, or fragments of 
NOV-X proteins. The term "antibody" as used herein refers to immunoglobulin molecules 
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that 
contain an antigen binding site that specifically binds (immunoreacts with) an antigen. Such 
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, 
Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule 
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ 
from one another by the nature of the heavy chain present in the molecule. Certain classes 
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light 
chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a 
reference to all such classes, subclasses and types of human antibody species. 

An isolated NOV-X-related protein of the invention may be intended to serve as an 
antigen, or a portion or fragment thereof, and additionally can be used as an immunogen to 
generate antibodies that immunospecifically bind the antigen, using standard techniques for 
polyclonal and monoclonal antibody preparation. The full-length protein can be used or, 
alternatively, the invention provides antigenic peptide fragments of the antigen for use as 
immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the 
amino acid sequence of the full length protein, such as an amino acid sequence shown in SEQ 
ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, or 20, and encompasses an epitope thereof such that an 
antibody raised against the peptide forms a specific immune complex with the full length 
protein or with any fragment that contains the epitope. Preferably, the antigenic peptide 
comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at least 20 
amino acid residues, or at least 30 amino acid residues. Preferred epitopes encompassed by 
the antigenic peptide are regions of the protein that are located on its surface; commonly these 
are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of NOV-X-related protein that is located on the surface of the 
protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human NOV-X-related 
protein sequence will indicate which regions of a NOV-X-related protein are particularly 
hydrophilic and, therefore, are likely to encode surface residues useful for targeting antibody 
production. As a means for targeting antibody production, hydropathy plots showing regions 
of hydrophilicity and hydrophobicity may be generated by any method well known in the art, 
including, for example, the Kyte Doolittle or the Hopp Woods methods, either with or without 
Fourier transformation. See, e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 
3824-3828; Kyte and Doolittle 1982, J. Mol. Biol. 157: 105-142, each of which is incorporated 
herein by reference in its entirety. Antibodies that are specific for one or more domains within 
an antigenic protein, or derivatives, fragments, analogs or homologs thereof, are also provided 
herein. 
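
The hydropathy analysis mentioned above can be illustrated with a short Python sketch of a Kyte Doolittle sliding-window profile, without Fourier transformation. The per-residue values are the published Kyte Doolittle hydropathy indices. The window length of 7, the function names, and the example peptide are assumptions chosen for illustration only. Windows with the most negative average are the most hydrophilic and are therefore candidate surface regions.

```python
# Kyte Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Average hydropathy of every full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]  # KeyError on non-standard residues
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, average) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


if __name__ == "__main__":
    # Hypothetical peptide: a charged stretch followed by an aliphatic stretch.
    print(most_hydrophilic_window("KKDEKRILVILVAA", window=5))
```

A plot of the profile against the window start position gives the hydropathy plot described in the text; a Hopp Woods analysis would follow the same pattern with a different scale.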

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal 
or monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated 
to a second protein known to be immunogenic in the mammal being immunized. Examples of 
such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum 
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further 
include an adjuvant. Various adjuvants used to increase the immunological response include, 
but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum 
hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, 
peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as Bacille 
Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents. 
Additional examples of adjuvants which can be employed include MPL-TDM adjuvant 
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known 
techniques, such as affinity chromatography using protein A or protein G, which provide 
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific 
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be 
immobilized on a column to purify the immune specific antibody by immunoaffinity 
chromatography. Purification of immunoglobulins is discussed, for example, by D. Wilkinson 
(The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 
2000), pp. 25-28). 



Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular 
species of antibody molecule consisting of a unique light chain gene product and a unique 
heavy chain gene product. In particular, the complementarity determining regions (CDRs) of 
the monoclonal antibody are identical in all the molecules of the population. MAbs thus 
contain an antigen binding site capable of immunoreacting with a particular epitope of the 
antigen characterized by a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature. 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically 
bind to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof or 
a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of 
human origin are desired, or spleen cells or lymph node cells are used if non-human 
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell 
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell 
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 
59-103). Immortalized cell lines are usually transformed mammalian cells, particularly myeloma 
cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell lines are 
employed. The hybridoma cells can be cultured in a suitable culture medium that preferably 
contains one or more substances that inhibit the growth or survival of the unfused, 
immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine 
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas 
typically will include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which 
substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a 
medium such as HAT medium. More preferred immortalized cell lines are murine myeloma 
lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, San 
Diego, California and the American Type Culture Collection, Manassas, Virginia. Human 
myeloma and mouse-human heteromyeloma cell lines also have been described for the 
production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur 
et al., Monoclonal Antibody Production Techniques and Applications, Marcel Dekker, Inc., 
New York, (1987) pp. 51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in 
the art. The binding affinity of the monoclonal antibody can, for example, be determined by 
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
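
As a rough illustration of how a binding affinity can be read from a Scatchard analysis, the Python sketch below fits the classical linear Scatchard transformation, bound/free = (Bmax - bound)/Kd, by ordinary least squares. It assumes a single class of non-interacting binding sites and is not the weighted curve-fitting procedure of the cited reference. The function name and the concentrations are hypothetical, and the example data are synthetic values generated from an assumed Kd and Bmax rather than measurements.

```python
def scatchard_fit(bound, free):
    """Fit bound/free against bound; return (Kd, Bmax) from slope and intercept.

    The Scatchard line has slope -1/Kd and an x-intercept equal to Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired bound/free measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: data do not fit a one-site model")
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


if __name__ == "__main__":
    # Synthetic one-site binding data from assumed Kd = 2.0 and Bmax = 10.0
    # (arbitrary concentration units), used only to exercise the fit.
    assumed_kd, assumed_bmax = 2.0, 10.0
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [assumed_bmax * f / (assumed_kd + f) for f in free]
    print(scatchard_fit(bound, free))
```

With real assay data the points rarely fall on a perfect line, and curvature in the Scatchard plot usually indicates that the one-site assumption does not hold.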

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods. Suitable culture media for this 
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from 
the culture medium or ascites fluid by conventional immunoglobulin purification procedures 
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel 
electrophoresis, dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of 
the invention can be readily isolated and sequenced using conventional procedures (e.g., by 
using oligonucleotide probes that are capable of binding specifically to genes encoding the 
heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as a 
preferred source of such DNA. Once isolated, the DNA can be placed into expression vectors, 
which are then transfected into host cells such as simian COS cells, Chinese hamster ovary 
(CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, to 
obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA also 
can be modified, for example, by substituting the coding sequence for human heavy and light 
chain constant domains in place of the homologous murine sequences (U.S. Patent No. 
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the 
immunoglobulin coding sequence all or part of the coding sequence for a non-immunoglobulin 
polypeptide. Such a non-immunoglobulin polypeptide can be substituted for the constant 
domains of an antibody of the invention, or can be substituted for the variable domains of one 
antigen-combining site of an antibody of the invention to create a chimeric bivalent antibody. 

Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further 
comprise humanized antibodies or human antibodies. These antibodies are suitable for 
administration to humans without engendering an immune response by the human against the 
administered immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other 
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 

Humanization can be performed following the method of Winter and co-workers 
(Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); 
Verhoeyen et al., Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR 
sequences for the corresponding sequences of a human antibody. (See also U.S. Patent No. 
5,225,539.) In some instances, Fv framework residues of the human immunoglobulin are 
replaced by corresponding non-human residues. Humanized antibodies can also comprise 
residues which are found neither in the recipient antibody nor in the imported CDR or 
framework sequences. In general, the humanized antibody will comprise substantially all of at 
least one, and typically two, variable domains, in which all or substantially all of the CDR 
regions correspond to those of a non-human immunoglobulin and all or substantially all of the 
framework regions are those of a human immunoglobulin consensus sequence. The 
humanized antibody optimally also will comprise at least a portion of an immunoglobulin 
constant region (Fc), typically that of a human immunoglobulin (Jones et al., 1986; 
Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)). 

Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by 
using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or 
by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in 
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. This 
approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 
5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); 
Lonberg et al. (Nature 368 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild 
et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 
(1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's 
endogenous antibodies in response to challenge by an antigen. (See PCT publication 
WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains in 
the nonhuman host have been incapacitated, and active loci encoding human heavy and light 
chain immunoglobulins are inserted into the host's genome. The human genes are 
incorporated, for example, using yeast artificial chromosomes containing the requisite human 
DNA segments. An animal which provides all the desired modifications is then obtained as 
progeny by crossbreeding intermediate transgenic animals containing fewer than the full 
complement of the modifications. The preferred embodiment of such a nonhuman animal is a 
mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO 96/33735 and 
WO 96/34096. This animal produces B cells which secrete fully human immunoglobulins. 
The antibodies can be obtained directly from the animal after immunization with an 
immunogen of interest, as, for example, a preparation of a polyclonal antibody, or alternatively 
from immortalized B cells derived from the animal, such as hybridomas producing 
monoclonal antibodies. Additionally, the genes encoding the immunoglobulins with human 
variable regions can be recovered and expressed to obtain the antibodies directly, or can be 
further modified to obtain analogs of antibodies such as, for example, single chain Fv 
molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent
No. 5,939,598. It can be obtained by a method including deleting the J segment genes from at
least one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of
the locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain
locus, the deletion being effected by a targeting vector containing a gene encoding a selectable
marker; and producing from the embryonic stem cell a transgenic mouse whose somatic and
germ cells contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed
in U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture,
introducing an expression vector containing a nucleotide sequence encoding a light chain into
another mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell
expresses an antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody that
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT
publication WO 99/53049.

Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective
identification of monoclonal Fab fragments with the desired specificity for a protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the
idiotypes to a protein antigen may be produced by techniques known in the art including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; (ii)
an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab
fragment generated by the treatment of the antibody molecule with papain and a reducing
agent; and (iv) Fv fragments.

Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is
any other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce
a potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by
affinity chromatography steps. Similar procedures are disclosed in WO 93/08829, published
13 May 1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
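
The figure of ten molecules for a quadroma can be checked by a short enumeration. The
following Python sketch is illustrative only and rests on a simplifying assumption that is not
stated in the text: each antibody is modeled as an unordered pair of heavy-chain/light-chain
arms, with any heavy chain free to pair with any light chain. The chain labels (HA, HB, LA, LB)
are hypothetical names chosen for the example.

```python
from itertools import combinations_with_replacement, product

# Hypothetical labels: parent antibody A contributes HA and LA, parent B contributes HB and LB.
HEAVY_CHAINS = ["HA", "HB"]
LIGHT_CHAINS = ["LA", "LB"]


def enumerate_quadroma_products():
    """List every distinct antibody, modeled as an unordered pair of arms."""
    # An arm is one heavy chain paired with one light chain (4 possible arms).
    arms = [f"{h}/{l}" for h, l in product(HEAVY_CHAINS, LIGHT_CHAINS)]
    # An antibody has two arms; the order of the arms does not matter.
    return list(combinations_with_replacement(arms, 2))


def correct_bispecifics(antibodies):
    """Keep only molecules with one correctly paired arm from each parent."""
    wanted = {"HA/LA", "HB/LB"}
    return [ab for ab in antibodies if set(ab) == wanted]


antibodies = enumerate_quadroma_products()
bispecific = correct_bispecifics(antibodies)
print(len(antibodies), "possible molecules;", len(bispecific), "with the correct structure")
```

Under this model, four possible arms give ten unordered pairs, and only the pair HA/LA with
HB/LB has the desired bispecific structure, which is why a purification step is needed.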

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant
region (CH1) containing the site necessary for light-chain binding present in at least one of the
fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and are
co-transfected into a suitable host organism. For further details of generating bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which
are recovered from recombinant cell culture. The preferred interface comprises at least a part
of the CH3 region of an antibody constant domain. In this method, one or more small amino
acid side chains from the interface of the first antibody molecule are replaced with larger side
chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the
large side chain(s) are created on the interface of the second antibody molecule by replacing
large amino acid side chains with smaller ones (e.g. alanine or threonine). This provides a
mechanism for increasing the yield of the heterodimer over other unwanted end-products such
as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab'
fragment was separately secreted from E. coli and subjected to directed chemical coupling in
vitro to form the bispecific antibody. The bispecific antibody thus formed was able to bind to
cells overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic
activity of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody heterodimers.
This method can also be utilized for the production of antibody homodimers. The "diabody"
technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has
provided an alternative mechanism for making bispecific antibody fragments. The fragments
comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL)
by a linker which is too short to allow pairing between the two domains on the same chain.
Accordingly, the VH and VL domains of one fragment are forced to pair with the
complementary VL and VH domains of another fragment, thereby forming two antigen-binding
sites. Another strategy for making bispecific antibody fragments by the use of single-chain Fv
(sFv) dimers has also been reported. See, Gruber et al., J. Immunol. 152:5368 (1994).
Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or
Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as
to focus cellular defense mechanisms to the cell expressing the particular antigen. Bispecific
antibodies can also be used to direct cytotoxic agents to cells which express a particular
antigen. These antibodies possess an antigen-binding arm and an arm which binds a cytotoxic
agent or a radionuclide chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another
bispecific antibody of interest binds the protein antigen described herein and further binds
tissue factor (TF).



Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such 
antibodies have, for example, been proposed to target immune system cells to unwanted cells 
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using
known methods in synthetic protein chemistry, including those involving crosslinking agents.
For example, immunotoxins can be constructed using a disulfide exchange reaction or by
forming a thioether bond. Examples of suitable reagents for this purpose include iminothiolate
and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No.
4,676,980.

Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing interchain
disulfide bond formation in this region. The homodimeric antibody thus generated can have
improved internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176:
1191-1195 (1992) and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with
enhanced anti-tumor activity can also be prepared using heterobifunctional cross-linkers as
described in Wolff et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody
can be engineered that has dual Fc regions and can thereby have enhanced complement lysis
and ADCC capabilities. See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive
isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain
(from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor,
gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples
include ²¹²Bi, ⁹⁰Y and ¹⁸⁶Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine),
bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates
(such as tolylene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled
1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary
chelating agent for conjugation of radionucleotide to the antibody. See WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

NOV-X Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression vectors,
containing a nucleic acid encoding a NOV-X protein, or derivatives, fragments, analogs or
homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule capable
of transporting another nucleic acid to which it has been linked. One type of vector is a
"plasmid", which refers to a circular double stranded DNA loop into which additional DNA
segments can be ligated. Another type of vector is a viral vector, wherein additional DNA
segments can be ligated into the viral genome. Certain vectors are capable of autonomous
replication in a host cell into which they are introduced (e.g., bacterial vectors having a
bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g.,
non-episomal mammalian vectors) are integrated into the genome of a host cell upon
introduction into the host cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression of genes to which they are
operatively-linked. Such vectors are referred to herein as "expression vectors". In general,
expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
In the present specification, "plasmid" and "vector" can be used interchangeably as the
plasmid is the most commonly used form of vector. However, the invention is intended to
include such other forms of expression vectors, such as viral vectors (e.g., replication defective
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the
invention in a form suitable for expression of the nucleic acid in a host cell, which means that
the recombinant expression vectors include one or more regulatory sequences, selected on the
basis of the host cells to be used for expression, that is operatively-linked to the nucleic acid
sequence to be expressed. Within a recombinant expression vector, "operably-linked" is
intended to mean that the nucleotide sequence of interest is linked to the regulatory
sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in
vitro transcription/translation system or in a host cell when the vector is introduced into the
host cell).

The term "regulatory sequence" is intended to include promoters, enhancers and other
expression control elements (e.g., polyadenylation signals). Such regulatory sequences are
described, for example, in Goeddel, Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include
those that direct constitutive expression of a nucleotide sequence in many types of host cell
and those that direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the
design of the expression vector can depend on such factors as the choice of the host cell to be
transformed, the level of expression of protein desired, etc. The expression vectors of the
invention can be introduced into host cells to thereby produce proteins or peptides, including
fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., NOV-X
proteins, mutant forms of NOV-X proteins, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for expression of
NOV-X proteins in prokaryotic or eukaryotic cells. For example, NOV-X proteins can be
expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression
vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel,
Gene Expression Technology: Methods in Enzymology 185, Academic Press, San
Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and
translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in Escherichia coli with
vectors containing constitutive or inducible promoters directing the expression of either fusion
or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded
therein, usually to the amino terminus of the recombinant protein. Such fusion vectors
typically serve three purposes: (i) to increase expression of recombinant protein; (ii) to
increase the solubility of the recombinant protein; and (iii) to aid in the purification of the
recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression
vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the
recombinant protein to enable separation of the recombinant protein from the fusion moiety
subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition
sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors
include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988. Gene 67: 31-40), pMAL
(New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) that fuse
glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the
target recombinant protein.
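
The role of the cleavage site at the fusion junction can be illustrated with a short Python
sketch. It is not taken from the specification: the recognition motifs shown (IEGR for Factor
Xa, LVPRGS for thrombin, DDDDK for enterokinase) are the commonly cited consensus sequences
and are supplied here as assumptions, the tag and target sequences are hypothetical
placeholders, and cleavage is simplified to a cut at the first occurrence of the motif.

```python
# Assumed consensus motifs; the number is how many motif residues stay on the tag side.
PROTEASE_SITES = {
    "Factor Xa": ("IEGR", 4),      # cuts after the Arg of IEGR
    "thrombin": ("LVPRGS", 4),     # cuts between Arg and Gly, leaving GS on the target
    "enterokinase": ("DDDDK", 5),  # cuts after the Lys of DDDDK
}


def build_fusion(tag, protease, target):
    """Join fusion moiety, cleavage motif and target protein (N- to C-terminus)."""
    motif, _ = PROTEASE_SITES[protease]
    return tag + motif + target


def cleave(fusion, protease):
    """Split a fusion protein at the first occurrence of the protease motif."""
    motif, cut_offset = PROTEASE_SITES[protease]
    start = fusion.find(motif)
    if start < 0:
        raise ValueError(f"no {protease} site found")
    cut = start + cut_offset
    return fusion[:cut], fusion[cut:]


# Hypothetical sequences used only for illustration.
TAG = "MSPILGYWK"
TARGET = "MKTAYIAK"

for protease in PROTEASE_SITES:
    fusion = build_fusion(TAG, protease, TARGET)
    tag_part, released = cleave(fusion, protease)
    print(protease, "->", released)
```

In this sketch Factor Xa and enterokinase release the target with its native amino terminus,
whereas the thrombin motif leaves two extra residues (GS) on the released protein, a
consideration when choosing the junction sequence.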

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et
al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 60-89).

One strategy to maximize recombinant protein expression in E. coli is to express the
protein in a host bacteria with an impaired capacity to proteolytically cleave the recombinant
protein. See, e.g., Gottesman, Gene Expression Technology: Methods in Enzymology
185, Academic Press, San Diego, Calif. (1990) 119-128. Another strategy is to alter the
nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the
individual codons for each amino acid are those preferentially utilized in E. coli (see, e.g.,
Wada, et al., 1992. Nucl. Acids Res. 20: 2111-2118). Such alteration of nucleic acid
sequences of the invention can be carried out by standard DNA synthesis techniques.
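
The codon-alteration strategy can be sketched in a few lines of Python. The sketch below is
illustrative only: the table of "preferred" E. coli codons is an assumption made for the
example (one frequently used codon per amino acid) and is not taken from the specification or
from Wada et al.; a real design would use measured codon-usage frequencies. The function names
are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# ASSUMPTION: one illustrative "preferred" E. coli codon per amino acid.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna):
    """Translate a coding sequence (length divisible by 3) into one-letter protein code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode_for_host(dna):
    """Replace every codon with the host-preferred synonymous codon."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


original = "ATGCTAAGA"              # Met-Leu-Arg written with less-preferred codons
optimized = recode_for_host(original)
print(original, "->", optimized)    # ATGCTAAGA -> ATGCTGCGT
assert translate(original) == translate(optimized)
```

The final assertion makes the key point explicit: the nucleotide sequence changes, but the
encoded amino acid sequence does not.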

In another embodiment, the NOV-X expression vector is a yeast expression vector.
Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1
(Baldari, et al., 1987. EMBO J. 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30:
933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation,
San Diego, Calif.), and picZ (Invitrogen Corp, San Diego, Calif.).

Alternatively, NOV-X can be expressed in insect cells using baculovirus expression
vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g.,
SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the
pVL series (Lucklow and Summers, 1989. Virology 170: 31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in mammalian cells
using a mammalian expression vector. Examples of mammalian expression vectors include
pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman, et al., 1987. EMBO J. 6:
187-195). When used in mammalian cells, the expression vector's control functions are often
provided by viral regulatory elements. For example, commonly used promoters are derived
from polyoma, adenovirus 2, cytomegalovirus, and simian virus 40. For other suitable
expression systems for both prokaryotic and eukaryotic cells see, e.g., Chapters 16 and 17 of
Sambrook, et al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

In another embodiment, the recombinant mammalian expression vector is capable of
directing expression of the nucleic acid preferentially in a particular cell type (e.g.,
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific
promoters include the albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1:
268-277), lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43:
235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989. EMBO J.
8: 729-733) and immunoglobulins (Banerji, et al., 1983. Cell 33: 729-740; Queen and
Baltimore, 1983. Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament
promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477),
pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European
Application Publication No. 264,166). Developmentally-regulated promoters are also
encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249: 374-379)
and the α-fetoprotein promoter (Campes and Tilghman, 1989. Genes Dev. 3: 537-546).

The invention further provides a recombinant expression vector comprising a DNA
molecule of the invention cloned into the expression vector in an antisense orientation. That
is, the DNA molecule is operatively-linked to a regulatory sequence in a manner that allows
for expression (by transcription of the DNA molecule) of an RNA molecule that is antisense to
NOV-X mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the
antisense orientation can be chosen that direct the continuous expression of the antisense RNA
molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory
sequences can be chosen that direct constitutive, tissue specific or cell type specific expression
of antisense RNA. The antisense expression vector can be in the form of a recombinant
plasmid, phagemid or attenuated virus in which antisense nucleic acids are produced under the
control of a high efficiency regulatory region, the activity of which can be determined by the
cell type into which the vector is introduced. For a discussion of the regulation of gene
expression using antisense genes see, e.g., Weintraub, et al., "Antisense RNA as a molecular
tool for genetic analysis," Reviews-Trends in Genetics, Vol. 1(1) 1986.
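
The effect of cloning an insert in the antisense orientation can be illustrated with a short
Python sketch. It is a simplified model offered as an assumption, not a description of any
particular vector: the insert is represented by its coding-strand sequence, reversing the
orientation is modeled as taking the reverse complement, and the ten-nucleotide sequence is a
hypothetical placeholder rather than a NOV-X sequence.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the opposite strand of a DNA sequence, read 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """RNA made from a promoter reading this strand as the coding strand."""
    return coding_strand.replace("T", "U")


def is_antisense(rna, target_rna):
    """True if rna can base-pair along its full length with target_rna (antiparallel)."""
    return len(rna) == len(target_rna) and all(
        _RNA_PAIR[a] == b for a, b in zip(rna, reversed(target_rna))
    )


sense_insert = "ATGGCCAAGT"                           # hypothetical cDNA fragment
mrna = transcribe(sense_insert)                       # sense orientation gives mRNA-like RNA
antisense_insert = reverse_complement(sense_insert)   # same insert, flipped in the vector
antisense_rna = transcribe(antisense_insert)

print(mrna, antisense_rna, is_antisense(antisense_rna, mrna))
```

The same promoter therefore yields either the sense transcript or its full-length complement,
depending only on the orientation in which the DNA molecule is cloned.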

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such terms refer
not only to the particular subject cell but also to the progeny or potential progeny of such a
cell. Because certain modifications may occur in succeeding generations due to either
mutation or environmental influences, such progeny may not, in fact, be identical to the parent
cell, but are still included within the scope of the term as used herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, NOV-X protein
can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such
as human, Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are
known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection" are intended to refer to a variety of art-recognized techniques for introducing
foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium
chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting host cells can be found in
Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989),
and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may integrate
the foreign DNA into their genome. In order to identify and select these integrants, a gene that
encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the
host cells along with the gene of interest. Various selectable markers include those that confer
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same vector as that encoding NOV-X
or can be introduced on a separate vector. Cells stably transfected with the introduced
nucleic acid can be identified by drug selection (e.g., cells that have incorporated the
selectable marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can
be used to produce (i.e., express) NOV-X protein. Accordingly, the invention further provides
methods for producing NOV-X protein using the host cells of the invention. In one
embodiment, the method comprises culturing the host cell of invention (into which a
recombinant expression vector encoding NOV-X protein has been introduced) in a suitable
medium such that NOV-X protein is produced. In another embodiment, the method further
comprises isolating NOV-X protein from the medium or the host cell.

Transgenic NOV-X Animals 

The host cells of the invention can also be used to produce non-human transgenic 
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte or 
an embryonic stem cell into which NOV-X protein-coding sequences have been introduced. 
Such host cells can then be used to create non-human transgenic animals in which exogenous
NOV-X sequences have been introduced into their genome or homologous recombinant
animals in which endogenous NOV-X sequences have been altered. Such animals are useful
for studying the function and/or activity of NOV-X protein and for identifying and/or
evaluating modulators of NOV-X protein activity. As used herein, a "transgenic animal" is a
non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in 
which one or more of the cells of the animal includes a transgene. Other examples of 
transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, 
amphibians, etc. A transgene is exogenous DNA that is integrated into the genome of a cell 
from which a transgenic animal develops and that remains in the genome of the mature 
animal, thereby directing the expression of an encoded gene product in one or more cell types 
or tissues of the transgenic animal. As used herein, a "homologous recombinant animal" is a 
non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous 
NOV-X gene has been altered by homologous recombination between the endogenous gene 
and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell 
of the animal, prior to development of the animal. 

A transgenic animal of the invention can be created by introducing NOV-X-encoding 
nucleic acid into the male pronuclei of a fertilized oocyte (e.g., by microinjection, retroviral 
infection) and allowing the oocyte to develop in a pseudopregnant female foster animal. 
Sequences including SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 can be 
introduced as a transgene into the genome of a non-human animal. Alternatively, a non- 
human homologue of the human NOV-X gene, such as a mouse NOV-X gene, can be isolated 
based on hybridization to the human NOV-X cDNA (described further supra) and used as a 
transgene. Intronic sequences and polyadenylation signals can also be included in the 
transgene to increase the efficiency of expression of the transgene. A tissue-specific 
regulatory sequence(s) can be operably-linked to the NOV-X transgene to direct expression of
NOV-X protein to particular cells. Methods for generating transgenic animals via embryo
manipulation and microinjection, particularly animals such as mice, have become
conventional in the art and are described, for example, in U.S. Patent Nos. 4,736,866;
4,870,009; and 4,873,191; and Hogan, 1986. In: Manipulating the Mouse Embryo, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. Similar methods are used for
production of other transgenic animals. A transgenic founder animal can be identified based
upon the presence of the NOV-X transgene in its genome and/or expression of NOV-X mRNA
in tissues or cells of the animals. A transgenic founder animal can then be used to breed
additional animals carrying the transgene. Moreover, transgenic animals carrying a
transgene-encoding NOV-X protein can further be bred to other transgenic animals carrying other
transgenes.

To create a homologous recombinant animal, a vector is prepared which contains at 
least a portion of a NOV-X gene into which a deletion, addition or substitution has been 
introduced to thereby alter, e.g., functionally disrupt, the NOV-X gene. The NOV-X gene can 
be a human gene (e.g., the DNA of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57), 
but more preferably, is a non-human homologue of a human NOV-X gene. For example, a 
mouse homologue of human NOV-X gene of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 
22, or 57 can be used to construct a homologous recombination vector suitable for altering an 
endogenous NOV-X gene in the mouse genome. In one embodiment, the vector is designed 
such that, upon homologous recombination, the endogenous NOV-X gene is functionally 
disrupted (i.e., no longer encodes a functional protein; also referred to as a "knock out" 
vector). 

Alternatively, the vector can be designed such that, upon homologous recombination, 
the endogenous NOV-X gene is mutated or otherwise altered but still encodes functional 
protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of 
the endogenous NOV-X protein). In the homologous recombination vector, the altered portion 
of the NOV-X gene is flanked at its 5'- and 3'-termini by additional nucleic acid of the NOV-X 
gene to allow for homologous recombination to occur between the exogenous NOV-X gene 
carried by the vector and an endogenous NOV-X gene in an embryonic stem cell. The 
additional flanking NOV-X nucleic acid is of sufficient length for successful homologous 
recombination with the endogenous gene. Typically, several kilobases of flanking DNA (both 
at the 5'- and 3'-termini) are included in the vector. See, e.g., Thomas, et al., 1987. Cell 51: 
503 for a description of homologous recombination vectors. The vector is then introduced into 
an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced NOV-X 
gene has homologously-recombined with the endogenous NOV-X gene are selected. See, 
e.g., Li, et al., 1992. Cell 69: 915. 

The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to 
form aggregation chimeras. See, e.g., Bradley, 1987. In: Teratocarcinomas and 
Embryonic Stem Cells: A Practical Approach, Robertson, ed. IRL, Oxford, pp. 113-152. 
A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal 
and the embryo brought to term. Progeny harboring the homologously-recombined DNA in 
their germ cells can be used to breed animals in which all cells of the animal contain the 
homologously-recombined DNA by germline transmission of the transgene. Methods for 
constructing homologous recombination vectors and homologous recombinant animals are 
described further in Bradley, 1991. Curr. Opin. Biotechnol. 2: 823-829; PCT International 
Publication Nos.: WO 90/11354; WO 91/01140; WO 92/0968; and WO 93/04169. 

In another embodiment, transgenic non-human animals can be produced that contain 
selected systems that allow for regulated expression of the transgene. One example of such a 
system is the cre/loxP recombinase system of bacteriophage P1. For a description of the 
cre/loxP recombinase system, see, e.g., Lakso, et al., 1992. Proc. Natl. Acad. Sci. USA 89: 
6232-6236. Another example of a recombinase system is the FLP recombinase system of 
Saccharomyces cerevisiae. See, O'Gorman, et al., 1991. Science 251: 1351-1355. If a cre/loxP 
recombinase system is used to regulate expression of the transgene, animals containing 
transgenes encoding both the Cre recombinase and a selected protein are required. Such 
animals can be provided through the construction of "double" transgenic animals, e.g., by 
mating two transgenic animals, one containing a transgene encoding a selected protein and the 
other containing a transgene encoding a recombinase. 
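
As a rough illustration of the mating scheme described above, the expected share of "double" transgenic offspring can be estimated under simple Mendelian assumptions. The sketch below assumes that each parent is hemizygous for a single-copy transgene and that the two transgenes segregate independently, so that each is transmitted with probability 1/2; these assumptions and the function names are illustrative only and are not taken from the specification.

```python
from fractions import Fraction

# Assumption (not from the specification): a hemizygous parent transmits
# its single-copy transgene to one half of its offspring.
HEMIZYGOUS_TRANSMISSION = Fraction(1, 2)


def double_transgenic_fraction(p_recombinase=HEMIZYGOUS_TRANSMISSION,
                               p_selected=HEMIZYGOUS_TRANSMISSION):
    """Expected fraction of offspring that inherit both transgenes.

    p_recombinase: transmission probability of the recombinase transgene.
    p_selected: transmission probability of the selected-protein transgene.
    Independent segregation of the two transgenes is assumed.
    """
    return p_recombinase * p_selected


def expected_offspring_per_double(fraction):
    """Average number of offspring to screen per double transgenic animal."""
    if fraction <= 0:
        raise ValueError("fraction must be positive")
    return 1 / fraction


if __name__ == "__main__":
    share = double_transgenic_fraction()
    print("double transgenic share:", share)
    print("offspring screened per double:", expected_offspring_per_double(share))
```

Under these assumptions one quarter of the offspring would carry both transgenes; a homozygous parent (transmission probability 1) would raise the share to one half.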

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut, et al., 1997. Nature 385: 810-813. In brief, a 
cell (e.g., a somatic cell) from the transgenic animal can be isolated and induced to exit the 
growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the use of 
electrical pulses, to an enucleated oocyte from an animal of the same species from which the 
quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to 
morula or blastocyst and then transferred to a pseudopregnant female foster animal. The 
offspring borne of this female foster animal will be a clone of the animal from which the cell 
(e.g., the somatic cell) is isolated. 

Pharmaceutical Compositions 

The NOV-X nucleic acid molecules, NOV-X proteins, and anti-NOV-X antibodies 
(also referred to herein as "active compounds") of the invention, and derivatives, fragments, 
analogs and homologs thereof, can be incorporated into pharmaceutical compositions suitable 
for administration. Such compositions typically comprise the nucleic acid molecule, protein, 
or antibody and a pharmaceutically acceptable carrier. As used herein, "pharmaceutically 
acceptable carrier" is intended to include any and all solvents, dispersion media, coatings, 
antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, 
compatible with pharmaceutical administration. Suitable carriers are described in the most 
recent edition of Remington's Pharmaceutical Sciences, a standard reference text in the field, 
which is incorporated herein by reference. Preferred examples of such carriers or diluents 
include, but are not limited to, water, saline, Ringer's solutions, dextrose solution, and 5% 
human serum albumin. Liposomes and non-aqueous vehicles such as fixed oils may also be 
used. The use of such media and agents for pharmaceutically active substances is well known 
in the art. Except insofar as any conventional media or agent is incompatible with the active 
compound, use thereof in the compositions is contemplated. Supplementary active 
compounds can also be incorporated into the compositions. 

The antibodies disclosed herein can also be formulated as immunoliposomes. 
Liposomes containing the antibody are prepared by methods known in the art, such as 
described in Epstein et al., Proc. Natl. Acad. Sci. USA, 82: 3688 (1985); Hwang et al., Proc. 
Natl. Acad. Sci. USA, 77: 4030 (1980); and U.S. Pat. Nos. 4,485,045 and 4,544,545. 
Liposomes with enhanced circulation time are disclosed in U.S. Patent No. 5,013,556. 

Particularly useful liposomes can be generated by the reverse-phase evaporation 
method with a lipid composition comprising phosphatidylcholine, cholesterol, and 
PEG-derivatized phosphatidylethanolamine (PEG-PE). Liposomes are extruded through filters of 
defined pore size to yield liposomes with the desired diameter. Fab' fragments of the antibody 
of the present invention can be conjugated to the liposomes as described in Martin et al., J. 
Biol. Chem., 257: 286-288 (1982) via a disulfide-interchange reaction. A chemotherapeutic 
agent (such as Doxorubicin) is optionally contained within the liposome. See Gabizon et al., J. 
National Cancer Inst., 81(19): 1484 (1989). 

A pharmaceutical composition of the invention is formulated to be compatible with its 
intended route of administration. Examples of routes of administration include parenteral, 
e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (i.e., topical), 
transmucosal, and rectal administration. Solutions or suspensions used for parenteral, 
intradermal, or subcutaneous application can include the following components: a sterile 
diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, 
propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or 
methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such 
as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, citrates or phosphates, 
and agents for the adjustment of tonicity such as sodium chloride or dextrose. The pH can be 
adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral 
preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of 
glass or plastic. 

Pharmaceutical compositions suitable for injectable use include sterile aqueous 
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous 
preparation of sterile injectable solutions or dispersion. For intravenous administration, 
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, 
Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be 
sterile and should be fluid to the extent that easy syringeability exists. It must be stable under 
the conditions of manufacture and storage and must be preserved against the contaminating 
action of microorganisms such as bacteria and fungi. The carrier can be a solvent or 
dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, 
propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. 
The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by 
the maintenance of the required particle size in the case of dispersion and by the use of 
surfactants. Prevention of the action of microorganisms can be achieved by various 
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic 
acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, 
for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the 
composition. Prolonged absorption of the injectable compositions can be brought about by 
including in the composition an agent which delays absorption, for example, aluminum 
monostearate and gelatin. 

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., 
a NOV-X protein or anti-NOV-X antibody) in the required amount in an appropriate solvent 
with one or a combination of ingredients enumerated above, as required, followed by filtered 
sterilization. Generally, dispersions are prepared by incorporating the active compound into a 
sterile vehicle that contains a basic dispersion medium and the required other ingredients from 
those enumerated above. In the case of sterile powders for the preparation of sterile injectable 
solutions, methods of preparation are vacuum drying and freeze-drying that yields a powder of 
the active ingredient plus any additional desired ingredient from a previously sterile-filtered 
solution thereof. 

Oral compositions generally include an inert diluent or an edible carrier. They can be 
enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic 
administration, the active compound can be incorporated with excipients and used in the form 
of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier 
for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and 
swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or 
adjuvant materials can be included as part of the composition. The tablets, pills, capsules, 
troches and the like can contain any of the following ingredients, or compounds of a similar 
nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient 
such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a 
lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a 
sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, 
methyl salicylate, or orange flavoring. 

For administration by inhalation, the compounds are delivered in the form of an 
aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., 
a gas such as carbon dioxide, or a nebulizer. 

Systemic administration can also be by transmucosal or transdermal means. For transmucosal 
or transdermal administration, penetrants appropriate to the barrier to be permeated are used in 
the formulation. Such penetrants are generally known in the art, and include, for example, for 
transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal 
administration can be accomplished through the use of nasal sprays or suppositories. For 
transdermal administration, the active compounds are formulated into ointments, salves, gels, 
or creams as generally known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with 
conventional suppository bases such as cocoa butter and other glycerides) or retention enemas 
for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will protect 
the compound against rapid elimination from the body, such as a controlled release 
formulation, including implants and microencapsulated delivery systems. Biodegradable, 
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, 
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of 
such formulations will be apparent to those skilled in the art. The materials can also be 
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal 
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral 
antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared 
according to methods known to those skilled in the art, for example, as described in U.S. 
Patent No. 4,522,811. 

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form 
for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to 
physically discrete units suited as unitary dosages for the subject to be treated; each unit 
containing a predetermined quantity of active compound calculated to produce the desired 
therapeutic effect in association with the required pharmaceutical carrier. The specification 
for the dosage unit forms of the invention is dictated by and directly dependent on the 
unique characteristics of the active compound and the particular therapeutic effect to be 
achieved, and the limitations inherent in the art of compounding such an active compound for 
the treatment of individuals. 

The nucleic acid molecules of the invention can be inserted into vectors and used as 
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, 
intravenous injection, local administration (see, e.g., U.S. Patent No. 5,328,470) or by 
stereotactic injection (see, e.g., Chen, et al., 1994. Proc. Natl. Acad. Sci. USA 91: 3054-3057). 
The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector 
in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery 
vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced 
intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can 
include one or more cells that produce the gene delivery system. 

Antibodies specifically binding a protein of the invention, as well as other molecules 
identified by the screening assays disclosed herein, can be administered for the treatment of 
various disorders in the form of pharmaceutical compositions. Principles and considerations 
involved in preparing such compositions, as well as guidance in the choice of components are 
provided, for example, in Remington: The Science And Practice Of Pharmacy 19th ed. 
(Alfonso R. Gennaro, et al., editors) Mack Pub. Co., Easton, Pa.: 1995; Drug Absorption 
Enhancement: Concepts, Possibilities, Limitations, And Trends, Harwood Academic 
Publishers, Langhorne, Pa., 1994; and Peptide And Protein Drug Delivery (Advances In 
Parenteral Sciences, Vol. 4), 1991, M. Dekker, New York. If the antigenic protein is 
intracellular and whole antibodies are used as inhibitors, internalizing antibodies are preferred. 
However, liposomes can also be used to deliver the antibody, or an antibody fragment, into 
cells. Where antibody fragments are used, the smallest inhibitory fragment that specifically 
binds to the binding domain of the target protein is preferred. For example, based upon the 
variable-region sequences of an antibody, peptide molecules can be designed that retain the 
ability to bind the target protein sequence. Such peptides can be synthesized chemically 
and/or produced by recombinant DNA technology. See, e.g., Marasco et al., 1993 Proc. Natl. 
Acad. Sci. USA, 90: 7889-7893. The formulation herein can also contain more than one 
active compound as necessary for the particular indication being treated, preferably those with 
complementary activities that do not adversely affect each other. Alternatively, or in addition, 
the composition can comprise an agent that enhances its function, such as, for example, a 
cytotoxic agent, cytokine, chemotherapeutic agent, or growth-inhibitory agent. Such 
molecules are suitably present in combination in amounts that are effective for the purpose 
intended. The active ingredients can also be entrapped in microcapsules prepared, for 
example, by coacervation techniques or by interfacial polymerization, for example, 
hydroxymethylcellulose or gelatin-microcapsules and poly-(methylmethacrylate) 
microcapsules, respectively, in colloidal drug delivery systems (for example, liposomes, 
albumin microspheres, microemulsions, nano-particles, and nanocapsules) or in 
macroemulsions. 

The formulations to be used for in vivo administration must be sterile. This is readily 
accomplished by filtration through sterile filtration membranes. 

Sustained-release preparations can be prepared. Suitable examples of sustained-release 
preparations include semipermeable matrices of solid hydrophobic polymers containing the 
antibody, which matrices are in the form of shaped articles, e.g., films, or microcapsules. 
Examples of sustained-release matrices include polyesters, hydrogels (for example, 
poly(2-hydroxyethyl-methacrylate), or poly(vinylalcohol)), polylactides (U.S. Pat. No. 3,773,919), 
copolymers of L-glutamic acid and ethyl-L-glutamate, non-degradable ethylene-vinyl 
acetate, degradable lactic acid-glycolic acid copolymers such as the LUPRON DEPOT™ 
(injectable microspheres composed of lactic acid-glycolic acid copolymer and leuprolide 
acetate), and poly-D-(-)-3-hydroxybutyric acid. While polymers such as ethylene-vinyl 
acetate and lactic acid-glycolic acid enable release of molecules for over 100 days, certain 
hydrogels release proteins for shorter time periods. 

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 

Screening and Detection Methods 

The isolated nucleic acid molecules of the invention can be used to express NOV-X 
protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), 
to detect NOV-X mRNA (e.g., in a biological sample) or a genetic lesion in a NOV-X gene, 
and to modulate NOV-X activity, as described further, below. In addition, the NOV-X 
proteins can be used to screen drugs or compounds that modulate the NOV-X protein activity 
or expression as well as to treat disorders characterized by insufficient or excessive production 
of NOV-X protein or production of NOV-X protein forms that have decreased or aberrant 
activity compared to NOV-X wild-type protein. In addition, the anti-NOV-X antibodies of the 
invention can be used to detect and isolate NOV-X proteins and modulate NOV-X activity. 
For example, NOV-X activity includes growth and differentiation, antibody production, and 
tumor growth. 

The invention further pertains to novel agents identified by the screening assays 
described herein and uses thereof for treatments as described, supra. 

Screening Assays 

The invention provides a method (also referred to herein as a "screening assay") for 
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, 
peptidomimetics, small molecules or other drugs) that bind to NOV-X proteins or have a 
stimulatory or inhibitory effect on, e.g., NOV-X protein expression or NOV-X protein activity. 
The invention also includes compounds identified in the screening assays described herein. 

In one embodiment, the invention provides assays for screening candidate or test 
compounds which bind to or modulate the activity of the membrane-bound form of a NOV-X 
protein or polypeptide or biologically-active portion thereof. The test compounds of the 
invention can be obtained using any of the numerous approaches in combinatorial library 
methods known in the art, including: biological libraries; spatially addressable parallel solid 
phase or solution phase libraries; synthetic library methods requiring deconvolution; the 
"one-bead one-compound" library method; and synthetic library methods using affinity 
chromatography selection. The biological library approach is limited to peptide libraries, 
while the other four approaches are applicable to peptide, non-peptide oligomer or small 
molecule libraries of compounds. See, e.g., Lam, 1997. Anticancer Drug Design 12: 145. 

A "small molecule" as used herein, is meant to refer to a composition that has a 
molecular weight of less than about 5 kD and most preferably less than about 4 kD. Small 
molecules can be, e.g., nucleic acids, peptides, polypeptides, peptidomimetics, carbohydrates, 
lipids or other organic or inorganic molecules. Libraries of chemical and/or biological 
mixtures, such as fungal, bacterial, or algal extracts, are known in the art and can be screened 
with any of the assays of the invention. 
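
The molecular-weight definition above can be illustrated with a short sketch that sorts candidate compounds by mass before screening. Because the text says "about" 5 kD and "about" 4 kD, the hard cut-offs used here are a simplification, and the function name and example masses are hypothetical.

```python
# Thresholds taken from the definition in the text; treating "about" as an
# exact cut-off is a simplifying assumption of this sketch.
SMALL_MOLECULE_LIMIT_KD = 5.0
MOST_PREFERRED_LIMIT_KD = 4.0


def classify_by_mass(mass_da):
    """Classify a compound by molecular weight given in daltons."""
    if mass_da <= 0:
        raise ValueError("molecular weight must be positive")
    mass_kd = mass_da / 1000.0
    if mass_kd < MOST_PREFERRED_LIMIT_KD:
        return "small molecule (most preferred range)"
    if mass_kd < SMALL_MOLECULE_LIMIT_KD:
        return "small molecule"
    return "not a small molecule"


if __name__ == "__main__":
    # Hypothetical library entries: name -> molecular weight in daltons.
    library = {"compound_A": 350.0, "peptide_B": 4500.0, "protein_C": 150000.0}
    for name, mass in library.items():
        print(name, "->", classify_by_mass(mass))
```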

Examples of methods for the synthesis of molecular libraries can be found in the art, 
for example in: DeWitt, et al., 1993. Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb, et al., 1994. 
Proc. Natl. Acad. Sci. U.S.A. 91: 11422; Zuckermann, et al., 1994. J. Med. Chem. 37: 2678; 
Cho, et al., 1993. Science 261: 1303; Carrell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 
2059; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop, et al., 1994. J. 
Med. Chem. 37: 1233. 

Libraries of compounds may be presented in solution (e.g., Houghten, 1992. 
Biotechniques 13: 412-421), or on beads (Lam, 1991. Nature 354: 82-84), on chips (Fodor, 
1993. Nature 364: 555-556), bacteria (Ladner, U.S. Patent No. 5,223,409), spores (Ladner, 
U.S. Patent 5,233,409), plasmids (Cull, et al., 1992. Proc. Natl. Acad. Sci. USA 89: 
1865-1869) or on phage (Scott and Smith, 1990. Science 249: 386-390; Devlin, 1990. Science 
249: 404-406; Cwirla, et al., 1990. Proc. Natl. Acad. Sci. U.S.A. 87: 6378-6382; Felici, 1991. 
J. Mol. Biol. 222: 301-310; Ladner, U.S. Patent No. 5,233,409.). 

In one embodiment, an assay is a cell-based assay in which a cell which expresses a 
membrane-bound form of NOV-X protein, or a biologically-active portion thereof, on the cell 
surface is contacted with a test compound and the ability of the test compound to bind to a 
NOV-X protein determined. The cell, for example, can be of mammalian origin or a yeast 
cell. Determining the ability of the test compound to bind to the NOV-X protein can be 
accomplished, for example, by coupling the test compound with a radioisotope or enzymatic 
label such that binding of the test compound to the NOV-X protein or biologically-active 
portion thereof can be determined by detecting the labeled compound in a complex. For 
example, test compounds can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, 
and the radioisotope detected by direct counting of radioemission or by scintillation counting. 
Alternatively, test compounds can be enzymatically-labeled with, for example, horseradish 
peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by 
determination of conversion of an appropriate substrate to product. In one embodiment, the 
assay comprises contacting a cell which expresses a membrane-bound form of NOV-X 
protein, or a biologically-active portion thereof, on the cell surface with a known compound 
which binds NOV-X to form an assay mixture, contacting the assay mixture with a test 
compound, and determining the ability of the test compound to interact with a NOV-X protein, 
wherein determining the ability of the test compound to interact with a NOV-X protein 
comprises determining the ability of the test compound to preferentially bind to NOV-X 
protein or a biologically-active portion thereof as compared to the known compound. 
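
One way to put a number on preferential binding in the competition format described above is to measure how much of a labeled known compound a test compound displaces. The sketch below is a minimal illustration that assumes the known compound carries the label and that a background (non-specific) signal has been measured; the function name, parameter names and example counts are hypothetical and do not come from the specification.

```python
def percent_displacement(signal_known_alone, signal_with_test, background):
    """Percent of specifically bound labeled known compound displaced.

    signal_known_alone: label signal with the known compound only.
    signal_with_test: label signal after the test compound is added.
    background: non-specific signal measured without NOV-X protein.
    """
    specific_alone = signal_known_alone - background
    if specific_alone <= 0:
        raise ValueError("no specific binding of the known compound detected")
    specific_with_test = signal_with_test - background
    return 100.0 * (1.0 - specific_with_test / specific_alone)


if __name__ == "__main__":
    # Hypothetical scintillation counts (counts per minute).
    displaced = percent_displacement(1000.0, 400.0, 100.0)
    print(f"{displaced:.1f}% of the known compound displaced")
```

A higher value indicates that the test compound competes more effectively with the known compound under the chosen assay conditions; it is not, by itself, a binding constant.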

In another embodiment, an assay is a cell-based assay comprising contacting a cell 
expressing a membrane-bound form of NOV-X protein, or a biologically-active portion 
thereof, on the cell surface with a test compound and determining the ability of the test 
compound to modulate (e.g., stimulate or inhibit) the activity of the NOV-X protein or 
biologically-active portion thereof. Determining the ability of the test compound to modulate 
the activity of NOV-X or a biologically-active portion thereof can be accomplished, for 
example, by determining the ability of the NOV-X protein to bind to or interact with a NOV-X 
target molecule. As used herein, a "target molecule" is a molecule with which a NOV-X 
protein binds or interacts in nature, for example, a molecule on the surface of a cell which 
expresses a NOV-X interacting protein, a molecule on the surface of a second cell, a molecule 
in the extracellular milieu, a molecule associated with the internal surface of a cell membrane 
or a cytoplasmic molecule. A NOV-X target molecule can be a non-NOV-X molecule or a 
NOV-X protein or polypeptide of the invention. In one embodiment, a NOV-X target 
molecule is a component of a signal transduction pathway that facilitates transduction of an 
extracellular signal (e.g. a signal generated by binding of a compound to a membrane-bound 
NOV-X molecule) through the cell membrane and into the cell. The target, for example, can 
be a second intercellular protein that has catalytic activity or a protein that facilitates the 
association of downstream signaling molecules with NOV-X. 

Determining the ability of the NOV-X protein to bind to or interact with a NOV-X 
target molecule can be accomplished by one of the methods described above for determining 
direct binding. 

In one embodiment, determining the ability of the NOV-X protein to bind to or 
interact with a NOV-X target molecule can be accomplished by determining the activity of the 
target molecule. For example, the activity of the target molecule can be determined by 
detecting induction of a cellular second messenger of the target (i.e. intracellular Ca2+, 
diacylglycerol, IP3, etc.), detecting catalytic/enzymatic activity of the target on an appropriate 
substrate, detecting the induction of a reporter gene (comprising a NOV-X-responsive 
regulatory element operatively linked to a nucleic acid encoding a detectable marker, e.g., 
luciferase), or detecting a cellular response, for example, cell survival, cellular differentiation, 
or cell proliferation. 

In yet another embodiment, an assay of the invention is a cell-free assay comprising 
contacting a NOV-X protein or biologically-active portion thereof with a test compound and 
determining the ability of the test compound to bind to the NOV-X protein or biologically-active 
portion thereof. Binding of the test compound to the NOV-X protein can be determined 
either directly or indirectly as described above. 

In one such embodiment, the assay comprises contacting the NOV-X protein or 
biologically-active portion thereof with a known compound which binds NOV-X to form an 
assay mixture, contacting the assay mixture with a test compound, and determining the ability 
of the test compound to interact with a NOV-X protein, wherein determining the ability of the 
test compound to interact with a NOV-X protein comprises determining the ability of the test 
compound to preferentially bind to NOV-X or biologically-active portion thereof as compared 
to the known compound. 

In still another embodiment, an assay is a cell-free assay comprising contacting NOV-X 
protein or biologically-active portion thereof with a test compound and determining the 
ability of the test compound to modulate (e.g., stimulate or inhibit) the activity of the NOV-X 
protein or biologically-active portion thereof. Determining the ability of the test compound to 
modulate the activity of NOV-X can be accomplished, for example, by determining the ability 
of the NOV-X protein to bind to a NOV-X target molecule by one of the methods described 
above for determining direct binding. In an alternative embodiment, determining the ability of 
the test compound to modulate the activity of NOV-X protein can be accomplished by 
determining the ability of the NOV-X protein to further modulate a NOV-X target molecule. For 
example, the catalytic/enzymatic activity of the target molecule on an appropriate substrate 
can be determined as described above. 

In yet another embodiment, the cell-free assay comprises contacting the NOV-X 
protein or biologically-active portion thereof with a known compound which binds NOV-X 
protein to form an assay mixture, contacting the assay mixture with a test compound, and 
determining the ability of the test compound to interact with a NOV-X protein, wherein 
determining the ability of the test compound to interact with a NOV-X protein comprises 
determining the ability of the NOV-X protein to preferentially bind to or modulate the activity 
of a NOV-X target molecule. 

The cell-free assays of the invention are amenable to use of both the soluble form or 
the membrane-bound form of NOV-X protein. In the case of cell-free assays comprising the 
membrane-bound form of NOV-X protein, it may be desirable to utilize a solubilizing agent 
such that the membrane-bound form of NOV-X protein is maintained in solution. Examples 
of such solubilizing agents include non-ionic detergents such as n-octylglucoside, 
n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, 
decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, 
Isotridecypoly(ethylene glycol ether)n, N-dodecyl-N,N-dimethyl-3-ammonio-1-propane 
sulfonate, 3-(3-cholamidopropyl)dimethylamminiol-1-propane sulfonate (CHAPS), or 
3-(3-cholamidopropyl)dimethylamminiol-2-hydroxy-1-propane sulfonate (CHAPSO). 

In more than one embodiment of the above assay methods of the invention, it may be 
desirable to immobilize either NOV-X protein or its target molecule to facilitate separation of 
complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate 
automation of the assay. Binding of a test compound to NOV-X protein, or interaction of 
NOV-X protein with a target molecule in the presence and absence of a candidate compound, 
can be accomplished in any vessel suitable for containing the reactants. Examples of such 
vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a 
fusion protein can be provided that adds a domain that allows one or both of the proteins to be 
bound to a matrix. For example, GST-NOV-X fusion proteins or GST-target fusion proteins 
can be adsorbed onto glutathione sephaiose beads (Sigma Chemical, St. Louis, MO) or 
glutathione derivatized microtiter plates, that are flien combined with the test compound or the 
test compound and either the non-adsoibed target protein or NOV-X protein, and the mixture 
is incubated under conditions conducive to complex formation (e.g., at physiological 
conditions for salt and pH). Following incubation, the beads or microtiter plate wells are 
washed to remove any unbound components, the matrix immobilized in the case of beads, 
complex determined either directly or indirectly, for example, as described, supra. 
Alternatively, the complexes can be dissociated &om the matrix, and the level of NOV-X 
protein binding or activity detennined using standard techniques. 

Other techniques for immobilizing proteins on matrices can also be used in the
screening assays of the invention. For example, either the NOV-X protein or its target
molecule can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated
NOV-X protein or target molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques well-known within the art (e.g., biotinylation kit,
Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well
plates (Pierce Chemical). Alternatively, antibodies reactive with NOV-X protein or target
molecules, but which do not interfere with binding of the NOV-X protein to its target
molecule, can be derivatized to the wells of the plate, and unbound target or NOV-X protein
trapped in the wells by antibody conjugation. Methods for detecting such complexes, in
addition to those described above for the GST-immobilized complexes, include
immunodetection of complexes using antibodies reactive with the NOV-X protein or target
molecule, as well as enzyme-linked assays that rely on detecting an enzymatic activity
associated with the NOV-X protein or target molecule.

In another embodiment, modulators of NOV-X protein expression are identified in a
method wherein a cell is contacted with a candidate compound and the expression of NOV-X
mRNA or protein in the cell is determined. The level of expression of NOV-X mRNA or
protein in the presence of the candidate compound is compared to the level of expression of
NOV-X mRNA or protein in the absence of the candidate compound. The candidate
compound can then be identified as a modulator of NOV-X mRNA or protein expression
based upon this comparison. For example, when expression of NOV-X mRNA or protein is
greater (i.e., statistically significantly greater) in the presence of the candidate compound than
in its absence, the candidate compound is identified as a stimulator of NOV-X mRNA or
protein expression. Alternatively, when expression of NOV-X mRNA or protein is less
(statistically significantly less) in the presence of the candidate compound than in its absence,
the candidate compound is identified as an inhibitor of NOV-X mRNA or protein expression.
The level of NOV-X mRNA or protein expression in the cells can be determined by methods
described herein for detecting NOV-X mRNA or protein.
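
The comparison described above can be sketched in a few lines. This is only an illustrative sketch, not a method taken from the specification: it assumes replicate expression measurements with and without the candidate compound, uses an exact two-sided permutation test on the difference in means as one possible way to judge statistical significance, and uses a hypothetical threshold (alpha = 0.05). The function names and the data are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical expression levels (arbitrary units), four replicates each.
with_compound = [8.1, 7.9, 8.4, 8.0]
without_compound = [4.0, 4.2, 3.9, 4.1]
print(classify_compound(with_compound, without_compound))
```

With very few replicates an exact permutation test cannot reach small p-values, so the replicate count bounds what such a comparison can show.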

In yet another aspect of the invention, the NOV-X proteins can be used as "bait
proteins" in a two-hybrid assay or three hybrid assay (see, e.g., U.S. Patent No. 5,283,317;
Zervos, et al., 1993. Cell 72: 223-232; Madura, et al., 1993. J. Biol. Chem. 268: 12046-12054;
Bartel, et al., 1993. Biotechniques 14: 920-924; Iwabuchi, et al., 1993. Oncogene 8:
1693-1696; and Brent WO 94/10300), to identify other proteins that bind to or interact with
NOV-X ("NOV-X-binding proteins" or "NOV-X-bp") and modulate NOV-X activity. Such
NOV-X-binding proteins are also likely to be involved in the propagation of signals by the
NOV-X proteins as, for example, upstream or downstream elements of the NOV-X pathway.

The two-hybrid system is based on the modular nature of most transcription factors,
which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes
two different DNA constructs. In one construct, the gene that codes for NOV-X is fused to a
gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the
other construct, a DNA sequence, from a library of DNA sequences, that encodes an
unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to
interact, in vivo, forming a NOV-X-dependent complex, the DNA-binding and activation
domains of the transcription factor are brought into close proximity. This proximity allows
transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional
regulatory site responsive to the transcription factor. Expression of the reporter gene can be
detected and cell colonies containing the functional transcription factor can be isolated and
used to obtain the cloned gene that encodes the protein which interacts with NOV-X.

The invention further pertains to novel agents identified by the aforementioned
screening assays and uses thereof for treatments as described herein.

Detection Assays

Portions or fragments of the cDNA sequences identified herein (and the corresponding
complete gene sequences) can be used in numerous ways as polynucleotide reagents. By way
of example, and not of limitation, these sequences can be used to: (i) identify an individual
from a minute biological sample (tissue typing); and (ii) aid in forensic identification of a
biological sample. Some of these applications are described in the subsections, below.

Tissue Typing 

The NOV-X sequences of the invention can be used to identify individuals from minute
biological samples. In this technique, an individual's genomic DNA is digested with one or
more restriction enzymes, and probed on a Southern blot to yield unique bands for
identification. The sequences of the invention are useful as additional DNA markers for RFLP
("restriction fragment length polymorphisms," described in U.S. Patent No. 5,272,057).

Furthermore, the sequences of the invention can be used to provide an alternative
technique that determines the actual base-by-base DNA sequence of selected portions of an
individual's genome. Thus, the NOV-X sequences described herein can be used to prepare
two PCR primers from the 5'- and 3'-termini of the sequences. These primers can then be used
to amplify an individual's DNA and subsequently sequence it.
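
As a minimal sketch of how two primers could be read off the termini of a known sequence: the forward primer is taken directly from the 5' end, and the reverse primer is the reverse complement of the 3' end. The function names, the primer length, and the example template are all hypothetical; real primer design would also weigh melting temperature, GC content, and specificity, none of which is modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_primers(template, length=20):
    """Derive a forward and a reverse primer from the two ends of a template.

    The template is assumed to be written 5' to 3' on the sense strand.
    """
    template = template.upper()
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]                       # anneals to the antisense strand
    reverse = reverse_complement(template[-length:])  # anneals to the sense strand
    return forward, reverse


# Hypothetical 12-base template with 4-base primers, to keep the example readable.
print(terminal_primers("ATGCCCGGGTTA", length=4))
```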

Panels of corresponding DNA sequences from individuals, prepared in this manner,
can provide unique individual identifications, as each individual will have a unique set of such
DNA sequences due to allelic differences. The sequences of the invention can be used to
obtain such identification sequences from individuals and from tissue. The NOV-X sequences
of the invention uniquely represent portions of the human genome. Allelic variation occurs to
some degree in the coding regions of these sequences, and to a greater degree in the noncoding
regions. It is estimated that allelic variation between individual humans occurs with a
frequency of about once per each 500 bases. Much of the allelic variation is due to single
nucleotide polymorphisms (SNPs), which include restriction fragment length polymorphisms
(RFLPs).

Each of the sequences described herein can, to some degree, be used as a standard
against which DNA from an individual can be compared for identification purposes. Because
greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are
necessary to differentiate individuals. The noncoding sequences can comfortably provide
positive individual identification with a panel of perhaps 10 to 1,000 primers that each yield a
noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 are used, a more appropriate number of
primers for positive individual identification would be 500-2,000.
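
A rough back-of-envelope sketch of the panel sizes quoted above. It assumes, purely for illustration, that the stated average of about one allelic variation per 500 bases applies uniformly to every amplified region; since the text notes that noncoding regions vary more than coding regions, this would understate the count for noncoding amplicons. The function name and defaults are hypothetical.

```python
def expected_variant_sites(n_amplicons, amplicon_length=100, bases_per_variant=500):
    """Expected number of variable positions covered by a primer panel.

    Assumes one allelic variation per `bases_per_variant` bases on average,
    spread uniformly over all amplified bases.
    """
    return n_amplicons * amplicon_length / bases_per_variant


# Smallest and largest noncoding panels mentioned in the text.
for panel_size in (10, 1000):
    print(panel_size, expected_variant_sites(panel_size))
```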



Predictive Medicine 

The invention also pertains to the field of predictive medicine in which diagnostic
assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for
prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly,
one aspect of the invention relates to diagnostic assays for determining NOV-X protein and/or
nucleic acid expression as well as NOV-X activity, in the context of a biological sample (e.g.,
blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a
disease or disorder, or is at risk of developing a disorder, associated with aberrant NOV-X
expression or activity. Disorders associated with aberrant NOV-X expression or activity
include, for example, disorders of olfactory loss, e.g. trauma, HIV illness, neoplastic growth,
and neurological disorders, e.g. Parkinson's disease and Alzheimer's disease.

The invention also provides for prognostic (or predictive) assays for determining
whether an individual is at risk of developing a disorder associated with NOV-X protein,
nucleic acid expression or activity. For example, mutations in a NOV-X gene can be assayed
in a biological sample. Such assays can be used for prognostic or predictive purpose to
thereby prophylactically treat an individual prior to the onset of a disorder characterized by or
associated with NOV-X protein, nucleic acid expression, or biological activity.

Another aspect of the invention provides methods for determining NOV-X protein,
nucleic acid expression or activity in an individual to thereby select appropriate therapeutic or
prophylactic agents for that individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or
prophylactic treatment of an individual based on the genotype of the individual (e.g., the
genotype of the individual examined to determine the ability of the individual to respond to a
particular agent).

Yet another aspect of the invention pertains to monitoring the influence of agents (e.g., 
drugs, compounds) on the expression or activity of NOV-X in clinical trials. 
These and other agents are described in further detail in the following sections. 

Diagnostic Assays 

An exemplary method for detecting the presence or absence of NOV-X in a biological
sample involves obtaining a biological sample from a test subject and contacting the biological
sample with a compound or an agent capable of detecting NOV-X protein or nucleic acid (e.g.,
mRNA, genomic DNA) that encodes NOV-X protein such that the presence of NOV-X is
detected in the biological sample. An agent for detecting NOV-X mRNA or genomic DNA is
a labeled nucleic acid probe capable of hybridizing to NOV-X mRNA or genomic DNA. The
nucleic acid probe can be, for example, a full-length NOV-X nucleic acid, such as the nucleic
acid of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a portion thereof, such as
an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to
specifically hybridize under stringent conditions to NOV-X mRNA or genomic DNA. Other
suitable probes for use in the diagnostic assays of the invention are described herein.

One agent for detecting NOV-X protein is an antibody capable of binding to NOV-X
protein, preferably an antibody with a detectable label. Antibodies directed against a protein
of the invention may be used in methods known within the art relating to the localization
and/or quantitation of the protein (e.g., for use in measuring levels of the protein within
appropriate physiological samples, for use in diagnostic methods, for use in imaging the
protein, and the like). In a given embodiment, antibodies against the proteins, or derivatives,
fragments, analogs or homologs thereof, that contain the antigen binding domain, are utilized
as pharmacologically-active compounds.

An antibody specific for a protein of the invention can be used to isolate the protein by
standard techniques, such as immunoaffinity chromatography or immunoprecipitation. Such
an antibody can facilitate the purification of the natural protein antigen from cells and of
recombinantly produced antigen expressed in host cells. Moreover, such an antibody can be
used to detect the antigenic protein (e.g., in a cellular lysate or cell supernatant) in order to
evaluate the abundance and pattern of expression of the antigenic protein. Antibodies directed
against the protein can be used diagnostically to monitor protein levels in tissue as part of a
clinical testing procedure, e.g., to determine the efficacy of a given treatment
regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a
detectable substance. Examples of detectable substances include various enzymes, prosthetic
groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive
materials. Examples of suitable enzymes include horseradish peroxidase, alkaline
phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group
complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent
materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine,
dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a
luminescent material includes luminol; examples of bioluminescent materials include
luciferase, luciferin, and aequorin, and examples of suitable radioactive material include

Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or
a fragment thereof (e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the
probe or antibody, is intended to encompass direct labeling of the probe or antibody by
coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as
indirect labeling of the probe or antibody by reactivity with another reagent that is directly
labeled. Examples of indirect labeling include detection of a primary antibody using a
fluorescently-labeled secondary antibody and end-labeling of a DNA probe with biotin such
that it can be detected with fluorescently-labeled streptavidin. The term "biological sample" is
intended to include tissues, cells and biological fluids isolated from a subject, as well as
tissues, cells and fluids present within a subject. That is, the detection method of the invention
can be used to detect NOV-X mRNA, protein, or genomic DNA in a biological sample in vitro
as well as in vivo. For example, in vitro techniques for detection of NOV-X mRNA include
Northern hybridizations and in situ hybridizations. In vitro techniques for detection of NOV-X
protein include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations, and immunofluorescence. In vitro techniques for detection of NOV-X
genomic DNA include Southern hybridizations. Furthermore, in vivo techniques for detection
of NOV-X protein include introducing into a subject a labeled anti-NOV-X antibody. For
example, the antibody can be labeled with a radioactive marker whose presence and location
in a subject can be detected by standard imaging techniques.

In one embodiment, the biological sample contains protein molecules from the test
subject. Alternatively, the biological sample can contain mRNA molecules from the test
subject or genomic DNA molecules from the test subject. A preferred biological sample is a
peripheral blood leukocyte sample isolated by conventional means from a subject.
In one embodiment, the methods further involve obtaining a control biological sample from a
control subject, contacting the control sample with a compound or agent capable of detecting
NOV-X protein, mRNA, or genomic DNA, such that the presence of NOV-X protein, mRNA
or genomic DNA is detected in the biological sample, and comparing the presence of NOV-X
protein, mRNA or genomic DNA in the control sample with the presence of NOV-X protein,
mRNA or genomic DNA in the test sample.

The invention also encompasses kits for detecting the presence of NOV-X in a
biological sample. For example, the kit can comprise: a labeled compound or agent capable of
detecting NOV-X protein or mRNA in a biological sample; means for determining the amount
of NOV-X in the sample; and means for comparing the amount of NOV-X in the sample with
a standard. The compound or agent can be packaged in a suitable container. The kit can
further comprise instructions for using the kit to detect NOV-X protein or nucleic acid.

Prognostic Assays 

The diagnostic methods described herein can furthermore be utilized to identify
subjects having or at risk of developing a disease or disorder associated with aberrant NOV-X
expression or activity. For example, the assays described herein, such as the preceding
diagnostic assays or the following assays, can be utilized to identify a subject having or at risk
of developing a disorder associated with NOV-X protein, nucleic acid expression or activity.
Such disorders include, for example, disorders of olfactory loss, e.g. trauma, HIV illness,
neoplastic growth, and neurological disorders, e.g. Parkinson's disease and Alzheimer's
disease.

Alternatively, the prognostic assays can be utilized to identify a subject having or at
risk for developing a disease or disorder. Thus, the invention provides a method for
identifying a disease or disorder associated with aberrant NOV-X expression or activity in
which a test sample is obtained from a subject and NOV-X protein or nucleic acid (e.g.,
mRNA, genomic DNA) is detected, wherein the presence of NOV-X protein or nucleic acid is
diagnostic for a subject having or at risk of developing a disease or disorder associated with
aberrant NOV-X expression or activity. As used herein, a "test sample" refers to a biological
sample obtained from a subject of interest. For example, a test sample can be a biological
fluid (e.g., serum), cell sample, or tissue.

Furthermore, the prognostic assays described herein can be used to determine whether
a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein,
peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder
associated with aberrant NOV-X expression or activity. For example, such methods can be
used to determine whether a subject can be effectively treated with an agent for a disorder.
Thus, the invention provides methods for determining whether a subject can be effectively
treated with an agent for a disorder associated with aberrant NOV-X expression or activity in
which a test sample is obtained and NOV-X protein or nucleic acid is detected (e.g., wherein
the presence of NOV-X protein or nucleic acid is diagnostic for a subject that can be
administered the agent to treat a disorder associated with aberrant NOV-X expression or
activity).

The methods of the invention can also be used to detect genetic lesions in a NOV-X
gene, thereby determining if a subject with the lesioned gene is at risk for a disorder
characterized by aberrant cell proliferation and/or differentiation. In various embodiments, the
methods include detecting, in a sample of cells from the subject, the presence or absence of a
genetic lesion characterized by at least one of an alteration affecting the integrity of a gene
encoding a NOV-X-protein, or the misexpression of the NOV-X gene. For example, such
genetic lesions can be detected by ascertaining the existence of at least one of: (i) a deletion of
one or more nucleotides from a NOV-X gene; (ii) an addition of one or more nucleotides to a
NOV-X gene; (iii) a substitution of one or more nucleotides of a NOV-X gene; (iv) a
chromosomal rearrangement of a NOV-X gene; (v) an alteration in the level of a messenger
RNA transcript of a NOV-X gene; (vi) aberrant modification of a NOV-X gene, such as of the
methylation pattern of the genomic DNA; (vii) the presence of a non-wild-type splicing
pattern of a messenger RNA transcript of a NOV-X gene; (viii) a non-wild-type level of a
NOV-X protein; (ix) allelic loss of a NOV-X gene; and (x) inappropriate post-translational
modification of a NOV-X protein. As described herein, there are a large number of assay
techniques known in the art which can be used for detecting lesions in a NOV-X gene. A
preferred biological sample is a peripheral blood leukocyte sample isolated by conventional
means from a subject. However, any biological sample containing nucleated cells may be
used, including, for example, buccal mucosal cells.

In certain embodiments, detection of the lesion involves the use of a probe/primer in a
polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such
as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g.,
Landegran, et al., 1988. Science 241: 1077-1080; and Nakazawa, et al., 1994. Proc. Natl.
Acad. Sci. USA 91: 360-364), the latter of which can be particularly useful for detecting point
mutations in the NOV-X-gene (see, Abravaya, et al., 1995. Nucl. Acids Res. 23: 675-682).
This method can include the steps of collecting a sample of cells from a patient, isolating
nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the
nucleic acid sample with one or more primers that specifically hybridize to a NOV-X gene
under conditions such that hybridization and amplification of the NOV-X gene (if present)
occurs, and detecting the presence or absence of an amplification product, or detecting the size
of the amplification product and comparing the length to a control sample. It is anticipated
that PCR and/or LCR may be desirable to use as a preliminary amplification step in
conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self sustained sequence replication (see,
Guatelli, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878); transcriptional amplification
system (see, Kwoh, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 1173-1177); Qβ Replicase
(see, Lizardi, et al., 1988. BioTechnology 6: 1197), or any other nucleic acid amplification
method, followed by the detection of the amplified molecules using techniques well known to
those of skill in the art. These detection schemes are especially useful for the detection of
nucleic acid molecules if such molecules are present in very low numbers.

In an alternative embodiment, mutations in a NOV-X gene from a sample cell can be
identified by alterations in restriction enzyme cleavage patterns. For example, sample and
control DNA is isolated, amplified (optionally), digested with one or more restriction
endonucleases, and fragment length sizes are determined by gel electrophoresis and compared.
Differences in fragment length sizes between sample and control DNA indicate mutations in
the sample DNA. Moreover, sequence specific ribozymes (see, e.g., U.S. Patent
No. 5,493,531) can be used to score for the presence of specific mutations by development or
loss of a ribozyme cleavage site.
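
The fragment-length comparison described above can be sketched in silico. This is an illustrative toy, not a laboratory protocol: it assumes a linear molecule, a single enzyme, and exact recognition-site matching on one strand. The sequences are invented; the example uses the EcoRI recognition site GAATTC, which is cut after the first G. The function names are hypothetical.

```python
def digest(sequence, site, cut_offset=0):
    """Return fragment lengths after cutting a linear sequence at every site.

    `cut_offset` is the position within the recognition site where the
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Skip zero-length pieces that arise when a cut falls on a molecule end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def fragment_patterns_differ(sample, control, site, cut_offset=0):
    """True if sample and control give different sets of fragment lengths."""
    return sorted(digest(sample, site, cut_offset)) != sorted(
        digest(control, site, cut_offset)
    )


# Hypothetical control with two EcoRI sites; the sample has lost the second one.
control = "AAAA" + "GAATTC" + "TTTTTT" + "GAATTC" + "CC"
sample = "AAAA" + "GAATTC" + "TTTTTT" + "GAGTTC" + "CC"
print(digest(control, "GAATTC", 1), digest(sample, "GAATTC", 1))
```

A single base substitution inside a recognition site removes a cut, merging two fragments into one longer fragment, which is the kind of difference a gel would reveal.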

In other embodiments, genetic mutations in NOV-X can be identified by hybridizing a
sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing
hundreds or thousands of oligonucleotide probes. See, e.g., Cronin, et al., 1996. Human
Mutation 7: 244-255; Kozal, et al., 1996. Nat. Med. 2: 753-759. For example, genetic
mutations in NOV-X can be identified in two dimensional arrays containing light-generated
DNA probes as described in Cronin, et al., supra. Briefly, a first hybridization array of probes
can be used to scan through long stretches of DNA in a sample and control to identify base
changes between the sequences by making linear arrays of sequential overlapping probes.
This step allows the identification of point mutations. This is followed by a second
hybridization array that allows the characterization of specific mutations by using smaller,
specialized probe arrays complementary to all variants or mutations detected. Each mutation
array is composed of parallel probe sets, one complementary to the wild-type gene and the
other complementary to the mutant gene.
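
A minimal sketch of the "sequential overlapping probes" idea: a fixed-length window is slid along a reference sequence so that every base is covered by several probes. The window length, step, and function name are hypothetical, and the windows returned are target regions; a physical probe would be the complement of each window.

```python
def tile_probes(sequence, probe_length=25, step=1):
    """Return (start, window) pairs for overlapping windows along a sequence.

    With step < probe_length, consecutive windows overlap, so a single base
    change alters the hybridization target of several adjacent probes.
    """
    sequence = sequence.upper()
    return [
        (start, sequence[start:start + probe_length])
        for start in range(0, len(sequence) - probe_length + 1, step)
    ]


# Hypothetical 8-base reference, 4-base windows, step of 2.
for start, window in tile_probes("ACGTACGT", probe_length=4, step=2):
    print(start, window)
```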

In yet another embodiment, any of a variety of sequencing reactions known in the art can be
used to directly sequence the NOV-X gene and detect mutations by comparing the sequence of
the sample NOV-X with the corresponding wild-type (control) sequence. Examples of
sequencing reactions include those based on techniques developed by Maxim and Gilbert,
1977. Proc. Natl. Acad. Sci. USA 74: 560 or Sanger, 1977. Proc. Natl. Acad. Sci. USA 74:
5463. It is also contemplated that any of a variety of automated sequencing procedures can be
utilized when performing the diagnostic assays (see, e.g., Naeve, et al., 1995. Biotechniques
19: 448), including sequencing by mass spectrometry (see, e.g., PCT International Publication
No. WO 94/16101; Cohen, et al., 1996. Adv. Chromatography 36: 127-162; and Griffin, et al.,
1993. Appl. Biochem. Biotechnol. 38: 147-159).
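
Once a sample sequence has been obtained, the comparison with the wild-type (control) sequence can be sketched as a position-by-position scan. This simplified sketch assumes the two sequences are already aligned and of equal length, so it reports substitutions only; insertions and deletions would need a proper alignment step. The function name and sequences are hypothetical.

```python
def point_differences(sample, wild_type):
    """List (index, wild_type_base, sample_base) for each mismatched position.

    Indices are 0-based. Both sequences must already be aligned and equal
    in length; only substitutions are detected.
    """
    if len(sample) != len(wild_type):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, w, s)
        for i, (w, s) in enumerate(zip(wild_type.upper(), sample.upper()))
        if w != s
    ]


# Hypothetical reads differing by one substitution.
print(point_differences("ACGTTCGA", "ACGTACGA"))
```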

Other methods for detecting mutations in the NOV-X gene include methods in which
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or
RNA/DNA heteroduplexes. See, e.g., Myers, et al., 1985. Science 230: 1242. In general, the
art technique of "mismatch cleavage" starts by providing heteroduplexes formed by
hybridizing (labeled) RNA or DNA containing the wild-type NOV-X sequence with
potentially mutant RNA or DNA obtained from a tissue sample. The double-stranded
duplexes are treated with an agent that cleaves single-stranded regions of the duplex such as
which will exist due to basepair mismatches between the control and sample strands. For
instance, RNA/DNA duplexes can be treated with RNase and DNA/DNA hybrids treated with
S1 nuclease to enzymatically digest the mismatched regions. In other embodiments, either
DNA/DNA or RNA/DNA duplexes can be treated with hydroxylamine or osmium tetroxide
and with piperidine in order to digest mismatched regions. After digestion of the mismatched
regions, the resulting material is then separated by size on denaturing polyacrylamide gels to
determine the site of mutation. See, e.g., Cotton, et al., 1988. Proc. Natl. Acad. Sci. USA 85:
4397; Saleeba, et al., 1992. Methods Enzymol. 217: 286-295. In an embodiment, the control
DNA or RNA can be labeled for detection.

In still another embodiment, the mismatch cleavage reaction employs one or more
proteins that recognize mismatched base pairs in double-stranded DNA (so called "DNA
mismatch repair" enzymes) in defined systems for detecting and mapping point mutations in
NOV-X cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli
cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T
at G/T mismatches. See, e.g., Hsu, et al., 1994. Carcinogenesis 15: 1657-1662. According to
an exemplary embodiment, a probe based on a NOV-X sequence, e.g., a wild-type NOV-X
sequence, is hybridized to a cDNA or other DNA product from a test cell(s). The duplex is
treated with a DNA mismatch repair enzyme, and the cleavage products, if any, can be
detected from electrophoresis protocols or the like. See, e.g., U.S. Patent No. 5,459,039.

In other embodiments, alterations in electrophoretic mobility will be used to identify
mutations in NOV-X genes. For example, single strand conformation polymorphism (SSCP)
may be used to detect differences in electrophoretic mobility between mutant and wild type
nucleic acids. See, e.g., Orita, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 2766; Cotton,
1993. Mutat. Res. 285: 125-144; Hayashi, 1992. Genet. Anal. Tech. Appl. 9: 73-79.
Single-stranded DNA fragments of sample and control NOV-X nucleic acids will be denatured
and allowed to renature. The secondary structure of single-stranded nucleic acids varies
according to sequence, and the resulting alteration in electrophoretic mobility enables the detection
of even a single base change. The DNA fragments may be labeled or detected with labeled
probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in
which the secondary structure is more sensitive to a change in sequence. In one embodiment,
the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex
molecules on the basis of changes in electrophoretic mobility. See, e.g., Keen, et al., 1991.
Trends Genet. 7: 5.

In yet another embodiment, the movement of mutant or wild-type fragments in
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient
gel electrophoresis (DGGE). See, e.g., Myers, et al., 1985. Nature 313: 495. When DGGE is
used as the method of analysis, DNA will be modified to ensure that it does not completely
denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich
DNA by PCR. In a further embodiment, a temperature gradient is used in place of a
denaturing gradient to identify differences in the mobility of control and sample DNA. See,
e.g., Rosenbaum and Reissner, 1987. Biophys. Chem. 265: 12753.

Examples of other techniques for detecting point mutations include, but are not limited
to, selective oligonucleotide hybridization, selective amplification, or selective primer
extension. For example, oligonucleotide primers may be prepared in which the known
mutation is placed centrally and then hybridized to target DNA under conditions that permit
hybridization only if a perfect match is found. See, e.g., Saiki, et al., 1986. Nature 324: 163;
Saiki, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 6230. Such allele specific oligonucleotides
are hybridized to PCR amplified target DNA or a number of different mutations when the
oligonucleotides are attached to the hybridizing membrane and hybridized with labeled target
DNA.



Alternatively, allele specific amplification technology that depends on selective PCR
amplification may be used in conjunction with the instant invention. Oligonucleotides used as
primers for specific amplification may carry the mutation of interest in the center of the
molecule (so that amplification depends on differential hybridization; see, e.g., Gibbs, et al.,
1989. Nucl. Acids Res. 17: 2437-2448) or at the extreme 3'-terminus of one primer where,
under appropriate conditions, mismatch can prevent or reduce polymerase extension (see, e.g.,
Prossner, 1993. Tibtech. 11: 238). In addition it may be desirable to introduce a novel
restriction site in the region of the mutation to create cleavage-based detection. See, e.g.,
Gasparini, et al., 1992. Mol. Cell Probes 6: 1. It is anticipated that in certain embodiments
amplification may also be performed using Taq ligase for amplification. See, e.g., Barany,
1991. Proc. Natl. Acad. Sci. USA 88: 189. In such cases, ligation will occur only if there is a
perfect match at the 3'-terminus of the 5' sequence, making it possible to detect the presence of
a known mutation at a specific site by looking for the presence or absence of amplification.
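
By way of non-limiting illustration, the two oligonucleotide designs discussed above (a
probe carrying the known mutation centrally, and a primer carrying it at the extreme
3'-terminus) can be sketched in Python. The function names, the flank and primer lengths,
and the example sequence are hypothetical and are not taken from this specification;
hybridization and extension conditions are not modeled.

```python
def allele_specific_probes(reference, pos, alt_base, flank=9):
    """Return (wild_type_probe, mutant_probe) centered on 0-based index `pos`.

    `flank` bases are kept on each side, so the variant base sits in the
    middle of a probe of length 2 * flank + 1.
    """
    reference = reference.upper()
    alt_base = alt_base.upper()
    if pos - flank < 0 or pos + flank >= len(reference):
        raise ValueError("not enough flanking sequence around pos")
    if alt_base == reference[pos]:
        raise ValueError("alt_base is identical to the reference base")
    start, end = pos - flank, pos + flank + 1
    wild_type = reference[start:end]
    mutant = reference[start:pos] + alt_base + reference[pos + 1:end]
    return wild_type, mutant


def three_prime_primer(reference, pos, alt_base, length=20):
    """Return a forward primer whose 3'-terminal base is the mutant base."""
    reference = reference.upper()
    start = pos - length + 1
    if start < 0 or pos >= len(reference):
        raise ValueError("not enough upstream sequence for this primer length")
    return reference[start:pos] + alt_base.upper()


if __name__ == "__main__":
    ref = "ACGTACGTACGTACGTACGTACGT"  # made-up sequence, illustration only
    print(allele_specific_probes(ref, 12, "G", flank=4))
    print(three_prime_primer(ref, 12, "G", length=6))
```

Placing the variant base at the center of the probe maximizes the destabilizing effect of
a mismatch on hybridization, whereas placing it at the 3'-terminus of a primer targets
polymerase extension instead; the sketch only constructs the sequences.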

The methods described herein may be performed, for example, by utilizing
pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent
described herein, which may be conveniently used, e.g., in clinical settings to diagnose
patients exhibiting symptoms or family history of a disease or illness involving a NOV-X
gene.

Furthermore, any cell type or tissue, preferably peripheral blood leukocytes, in which
NOV-X is expressed may be utilized in the prognostic assays described herein. However, any
biological sample containing nucleated cells may be used, including, for example, buccal
mucosal cells.



Pharmacogenomics 

Agents, or modulators that have a stimulatory or inhibitory effect on NOV-X activity
(e.g., NOV-X gene expression), as identified by a screening assay described herein can be
administered to individuals to treat (prophylactically or therapeutically) disorders (e.g.,
disorders of olfactory loss, e.g., trauma, HIV illness, neoplastic growth, and neurological
disorders, e.g., Parkinson's disease and Alzheimer's disease). In conjunction with such
treatment, the pharmacogenomics (i.e., the study of the relationship between an individual's
genotype and that individual's response to a foreign compound or drug) of the individual may
be considered. Differences in metabolism of therapeutics can lead to severe toxicity or
therapeutic failure by altering the relation between dose and blood concentration of the
pharmacologically active drug. Thus, the pharmacogenomics of the individual permits the
selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a
consideration of the individual's genotype. Such pharmacogenomics can further be used to
determine appropriate dosages and therapeutic regimens. Accordingly, the activity of NOV-X
protein, expression of NOV-X nucleic acid, or mutation content of NOV-X genes in an
individual can be determined to thereby select appropriate agent(s) for therapeutic or
prophylactic treatment of the individual.

Pharmacogenomics deals with clinically significant hereditary variations in the
response to drugs due to altered drug disposition and abnormal action in affected persons. See,
e.g., Eichelbaum, 1996. Clin. Exp. Pharmacol. Physiol., 23: 983-985; Linder, 1997. Clin.
Chem., 43: 254-266. In general, two types of pharmacogenetic conditions can be
differentiated: genetic conditions transmitted as a single factor altering the way drugs act on
the body (altered drug action), or genetic conditions transmitted as single factors altering the
way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can
occur either as rare defects or as polymorphisms. For example, glucose-6-phosphate
dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main
clinical complication is hemolysis after ingestion of oxidant drugs (anti-malarials,
sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major
determinant of both the intensity and duration of drug action. The discovery of genetic
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and
cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why
some patients do not obtain the expected drug effects or show exaggerated drug response and
serious toxicity after taking the standard and safe dose of a drug. These polymorphisms are
expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor
metabolizer (PM). The prevalence of PM is different among different populations. For
example, the gene coding for CYP2D6 is highly polymorphic and several mutations have been
identified in PM, which all lead to the absence of functional CYP2D6. Poor metabolizers of
CYP2D6 and CYP2C19 quite frequently experience exaggerated drug response and side
effects when they receive standard doses. If a metabolite is the active therapeutic moiety, PM
show no therapeutic response, as demonstrated for the analgesic effect of codeine mediated by
its CYP2D6-formed metabolite morphine. At the other extreme are the so called ultra-rapid
metabolizers who do not respond to standard doses. Recently, the molecular basis of
ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification.
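
As a deliberately simplified illustration of the phenotypes named above, the following
Python sketch maps a count of functional CYP2D6 gene copies to a metabolizer label. The
copy-number thresholds and the function name are assumptions made for this example only;
the text states merely that PM lack functional CYP2D6 and that ultra-rapid metabolism is
due to gene amplification, and real genotype-to-phenotype assignment is more involved.

```python
def metabolizer_phenotype(functional_copies):
    """Map a functional CYP2D6 gene copy count to a phenotype label.

    Assumed thresholds (illustration only):
      0 copies   -> "PM" (poor metabolizer, no functional enzyme)
      1-2 copies -> "EM" (extensive metabolizer)
      >2 copies  -> "UM" (ultra-rapid metabolizer, gene amplification)
    """
    if functional_copies < 0:
        raise ValueError("copy count cannot be negative")
    if functional_copies == 0:
        return "PM"
    if functional_copies <= 2:
        return "EM"
    return "UM"


if __name__ == "__main__":
    for copies in (0, 1, 2, 3):
        print(copies, metabolizer_phenotype(copies))
```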

Thus, the activity of NOV-X protein, expression of NOV-X nucleic acid, or mutation
content of NOV-X genes in an individual can be determined to thereby select appropriate
agent(s) for therapeutic or prophylactic treatment of the individual. In addition,
pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding
drug-metabolizing enzymes to the identification of an individual's drug responsiveness
phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse
reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when
treating a subject with a NOV-X modulator, such as a modulator identified by one of the
exemplary screening assays described herein.

Monitoring of Effects During Clinical Trials

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or
activity of NOV-X (e.g., the ability to modulate aberrant cell proliferation) can be applied not
only in basic drug screening, but also in clinical trials. For example, the effectiveness of an
agent determined by a screening assay as described herein to increase NOV-X gene
expression, protein levels, or upregulate NOV-X activity, can be monitored in clinical trials of
subjects exhibiting decreased NOV-X gene expression, protein levels, or downregulated
NOV-X activity. Alternatively, the effectiveness of an agent determined by a screening assay
to decrease NOV-X gene expression, protein levels, or downregulate NOV-X activity, can be
monitored in clinical trials of subjects exhibiting increased NOV-X gene expression, protein
levels, or upregulated NOV-X activity. In such clinical trials, the expression or activity of
NOV-X and, preferably, other genes that have been implicated in, for example, a cellular
proliferation or immune disorder can be used as a "read out" or markers of the immune
responsiveness of a particular cell.

By way of example, and not of limitation, genes, including NOV-X, that are modulated
in cells by treatment with an agent (e.g., compound, drug or small molecule) that modulates
NOV-X activity (e.g., identified in a screening assay as described herein) can be identified.
Thus, to study the effect of agents on cellular proliferation disorders, for example, in a clinical
trial, cells can be isolated and RNA prepared and analyzed for the levels of expression of
NOV-X and other genes implicated in the disorder. The levels of gene expression (i.e., a gene
expression pattern) can be quantified by Northern blot analysis or RT-PCR, as described
herein, or alternatively by measuring the amount of protein produced, by one of the methods
as described herein, or by measuring the levels of activity of NOV-X or other genes. In this
manner, the gene expression pattern can serve as a marker, indicative of the physiological
response of the cells to the agent. Accordingly, this response state may be determined before,
and at various points during, treatment of the individual with the agent.

In one embodiment, the invention provides a method for monitoring the effectiveness
of treatment of a subject with an agent (e.g., an agonist, antagonist, protein, peptide,
peptidomimetic, nucleic acid, small molecule, or other drug candidate identified by the
screening assays described herein) comprising the steps of (i) obtaining a pre-administration
sample from a subject prior to administration of the agent; (ii) detecting the level of expression
of a NOV-X protein, mRNA, or genomic DNA in the preadministration sample; (iii) obtaining
one or more post-administration samples from the subject; (iv) detecting the level of
expression or activity of the NOV-X protein, mRNA, or genomic DNA in the
post-administration samples; (v) comparing the level of expression or activity of the NOV-X
protein, mRNA, or genomic DNA in the pre-administration sample with the NOV-X protein,
mRNA, or genomic DNA in the post administration sample or samples; and (vi) altering the
administration of the agent to the subject accordingly. For example, increased administration
of the agent may be desirable to increase the expression or activity of NOV-X to higher levels
than detected, i.e., to increase the effectiveness of the agent. Alternatively, decreased
administration of the agent may be desirable to decrease expression or activity of NOV-X to
lower levels than detected, i.e., to decrease the effectiveness of the agent.
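
The comparison in step (v) can be sketched, purely for illustration, as a fold-change
calculation between the pre-administration level and each post-administration level. The
function name, the 10% tolerance band and the example values are hypothetical and are not
specified in this description; how administration is then altered in step (vi) is left to
the practitioner.

```python
def compare_expression(pre_level, post_levels, tolerance=0.1):
    """Compare each post-administration level with the pre-administration level.

    Returns a list of (fold_change, trend) tuples, where trend is
    "increased", "decreased" or "unchanged". `tolerance` is an assumed
    band around a fold change of 1.0 that is treated as no change.
    """
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    results = []
    for post in post_levels:
        fold = post / pre_level
        if fold > 1 + tolerance:
            trend = "increased"
        elif fold < 1 - tolerance:
            trend = "decreased"
        else:
            trend = "unchanged"
        results.append((fold, trend))
    return results


if __name__ == "__main__":
    # Made-up expression levels in arbitrary units.
    for fold, trend in compare_expression(10.0, [20.0, 10.5, 5.0]):
        print(f"{fold:.2f}x {trend}")
```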

Methods of Treatment 

The invention provides for both prophylactic and therapeutic methods of treating a
subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant
NOV-X expression or activity. Disorders associated with aberrant NOV-X expression include,
for example, disorders of olfactory loss, e.g., trauma, HIV illness, neoplastic growth, and
neurological disorders, e.g., Parkinson's disease and Alzheimer's disease.

These methods of treatment will be discussed more fully, below.

Disease and Disorders 

Diseases and disorders that are characterized by increased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that antagonize (i.e., reduce or inhibit) activity. Therapeutics that antagonize
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that may
be utilized include, but are not limited to: (i) an aforementioned peptide, or analogs,
derivatives, fragments or homologs thereof; (ii) antibodies to an aforementioned peptide; (iii)
nucleic acids encoding an aforementioned peptide; (iv) administration of antisense nucleic
acid and nucleic acids that are "dysfunctional" (i.e., due to a heterologous insertion within the
coding sequences of an aforementioned peptide) that are utilized to
"knockout" endogenous function of an aforementioned peptide by homologous recombination
(see, e.g., Capecchi, 1989. Science 244: 1288-1292); or (v) modulators (i.e., inhibitors,
agonists and antagonists, including additional peptide mimetic of the invention or antibodies
specific to a peptide of the invention) that alter the interaction between an aforementioned
peptide and its binding partner.

Diseases and disorders that are characterized by decreased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate activity
may be administered in a therapeutic or prophylactic manner. Therapeutics that may be
utilized include, but are not limited to, an aforementioned peptide, or analogs, derivatives,
fragments or homologs thereof; or an agonist that increases bioavailability.

Increased or decreased levels can be readily detected by quantifying peptide and/or
RNA, by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro for
RNA or peptide levels, structure and/or activity of the expressed peptides (or mRNAs of an
aforementioned peptide). Methods that are well-known within the art include, but are not
limited to, immunoassays (e.g., by Western blot analysis, immunoprecipitation followed by
sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis, immunocytochemistry, etc.)
and/or hybridization assays to detect expression of mRNAs (e.g., Northern assays, dot blots, in
situ hybridization, and the like).

Prophylactic Methods 

In one aspect, the invention provides a method for preventing, in a subject, a disease or
condition associated with an aberrant NOV-X expression or activity, by administering to the
subject an agent that modulates NOV-X expression or at least one NOV-X activity. Subjects
at risk for a disease that is caused or contributed to by aberrant NOV-X expression or activity
can be identified by, for example, any or a combination of diagnostic or prognostic assays as
described herein. Administration of a prophylactic agent can occur prior to the manifestation
of symptoms characteristic of the NOV-X aberrancy, such that a disease or disorder is
prevented or, alternatively, delayed in its progression. Depending upon the type of NOV-X
aberrancy, for example, a NOV-X agonist or NOV-X antagonist agent can be used for treating
the subject. The appropriate agent can be determined based on screening assays described
herein. The prophylactic methods of the invention are further discussed in the following
subsections.

Therapeutic Methods 

Another aspect of the invention pertains to methods of modulating NOV-X expression
or activity for therapeutic purposes. The modulatory method of the invention involves
contacting a cell with an agent that modulates one or more of the activities of NOV-X protein
activity associated with the cell. An agent that modulates NOV-X protein activity can be an
agent as described herein, such as a nucleic acid or a protein, a naturally-occurring cognate
ligand of a NOV-X protein, a peptide, a NOV-X peptidomimetic, or other small molecule. In
one embodiment, the agent stimulates one or more NOV-X protein activity. Examples of such
stimulatory agents include active NOV-X protein and a nucleic acid molecule encoding
NOV-X that has been introduced into the cell. In another embodiment, the agent inhibits one or
more NOV-X protein activity. Examples of such inhibitory agents include antisense NOV-X
nucleic acid molecules and anti-NOV-X antibodies. These modulatory methods can be
performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by
administering the agent to a subject). As such, the invention provides methods of treating an
individual afflicted with a disease or disorder characterized by aberrant expression or activity
of a NOV-X protein or nucleic acid molecule. In one embodiment, the method involves
administering an agent (e.g., an agent identified by a screening assay described herein), or
combination of agents that modulates (e.g., up-regulates or down-regulates) NOV-X
expression or activity. In another embodiment, the method involves administering a NOV-X
protein or nucleic acid molecule as therapy to compensate for reduced or aberrant NOV-X
expression or activity.

Stimulation of NOV-X activity is desirable in situations in which NOV-X is
abnormally downregulated and/or in which increased NOV-X activity is likely to have a
beneficial effect. One example of such a situation is where a subject has a disorder
characterized by aberrant cell proliferation and/or differentiation (e.g., c^^
associated ). Another example of such a situation is where the subject has an
immunodeficiency disease (e.g., AIDS).

Antibodies of the invention, including polyclonal, monoclonal, humanized and fully
human antibodies, may be used as therapeutic agents. Such agents will generally be employed to
treat or prevent a disease or pathology in a subject. An antibody preparation, preferably one
having high specificity and high affinity for its target antigen, is administered to the subject
and will generally have an effect due to its binding with the target. Such an effect may be one
of two kinds, depending on the specific nature of the interaction between the given antibody
molecule and the target antigen in question. In the first instance, administration of the
antibody may abrogate or inhibit the binding of the target with an endogenous ligand to which
it naturally binds. In this case, the antibody binds to the target and masks a binding site of the
naturally occurring ligand, wherein the ligand serves as an effector molecule. Thus the
receptor mediates a signal transduction pathway for which ligand is responsible.

Alternatively, the effect may be one in which the antibody elicits a physiological result
by virtue of binding to an effector binding site on the target molecule. In this case the target, a
receptor having an endogenous ligand which may be absent or defective in the disease or
pathology, binds the antibody as a surrogate effector ligand, initiating a receptor-based signal
transduction event by the receptor.

A therapeutically effective amount of an antibody of the invention relates generally to
the amount needed to achieve a therapeutic objective. As noted above, this may be a binding
interaction between the antibody and its target antigen that, in certain cases, interferes with the
functioning of the target, and in other cases, promotes a physiological response. The amount
required to be administered will furthermore depend on the binding affinity of the antibody for
its specific antigen, and will also depend on the rate at which an administered antibody is
depleted from the free volume of the subject to which it is administered. Common ranges for
therapeutically effective dosing of an antibody or antibody fragment of the invention may be,
by way of nonlimiting example, from about 0.1 mg/kg body weight to about 50 mg/kg body
weight. Common dosing frequencies may range, for example, from twice daily to once a
week.



Determination of the Biological Effect of the Therapeutic

In various embodiments of the invention, suitable in vitro or in vivo assays are
performed to determine the effect of a specific Therapeutic and whether its administration is
indicated for treatment of the affected tissue.

In various specific embodiments, in vitro assays may be performed with representative
cells of the type(s) involved in the patient's disorder, to determine if a given Therapeutic exerts
the desired effect upon the cell type(s). Compounds for use in therapy may be tested in
suitable animal model systems including, but not limited to rats, mice, chicken, cows,
monkeys, rabbits, and the like, prior to testing in human subjects. Similarly, for in vivo
testing, any of the animal model systems known in the art may be used prior to administration
to human subjects.



The invention will be further described in the following examples, which do not limit
the scope of the invention described in the claims.



Examples 

Example 1: Quantitative Expression Analysis of NOV-1, NOV-2, NOV-3, and NOV-4 in
various cells and tissues

RTQ-PCR Panel Descriptions: 
Panel 1 

As shown in the expression data in Tables 39, 40, and 41, Panel 1 of each table is
composed of RNA or cDNA isolated from various human cells or cell lines from normal and
cancerous tissue. These cells and cell lines have been extensively characterized by
investigators in both academia and the commercial sector regarding their tumorigenicity,
metastatic potential, drug resistance, invasive potential, and other cancer-related properties.
They serve as suitable tools for pre-clinical evaluation of anti-cancer agents and promising
therapeutic strategies.

Panel 2: 

In Tables 39, 40, and 41, Panel 2 of each table includes 2 control wells and 94 test
samples composed of RNA or cDNA isolated from human tissue procured by surgeons
working in close cooperation with the National Cancer Institute's Cooperative Human Tissue
Network (CHTN) or the National Disease Research Interchange (NDRI). The tissues are derived
from human malignancies and in cases where indicated, many malignant tissues have
"matched margins", which are non-cancerous tissue adjacent to the tumor. These are termed
normal adjacent tissues and are denoted "NAT" in Tables 39, 40, and 41. The tumor tissue
and the matched margins are evaluated by two independent pathologists at NDRI or CHTN.
This analysis provides a gross histopathological assessment of tumor differentiation grade.
Moreover, most samples include the original surgical pathology report that provides
information regarding the clinical stage of the patient. In addition, these RNA and cDNA
samples were obtained from various human tissues derived from autopsies performed on
elderly people or sudden death victims (accidents, etc.). These tissues were ascertained to be
free of disease and were purchased from various commercial sources such as Clontech (Palo
Alto, CA), Research Genetics, and Invitrogen.

RNA integrity from all samples is controlled for quality by visual assessment of
agarose gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio as a
guide (2:1 to 2.5:1 28S:18S) and the absence of low molecular weight RNAs that would be
indicative of degradation products. Samples are controlled against genomic DNA
contamination by RTQ PCR reactions run in the absence of reverse transcriptase using probe
and primer sets designed to amplify across the span of a single exon.
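
The RNA integrity criterion above can be expressed, as a hypothetical sketch, as a simple
acceptance check in Python. The specification describes a visual assessment; the numeric
band intensities, the function name and the boolean flag for low molecular weight RNA are
assumptions introduced only for this illustration.

```python
def passes_rna_qc(intensity_28s, intensity_18s, low_mw_rna_present=False,
                  min_ratio=2.0, max_ratio=2.5):
    """Return True if a sample meets the stated integrity guide.

    The 28S:18S staining intensity ratio must fall between 2:1 and 2.5:1,
    and no low molecular weight RNA (indicative of degradation) may be seen.
    """
    if intensity_18s <= 0:
        raise ValueError("18S intensity must be positive")
    ratio = intensity_28s / intensity_18s
    return min_ratio <= ratio <= max_ratio and not low_mw_rna_present


if __name__ == "__main__":
    # Made-up band intensities in arbitrary densitometry units.
    print(passes_rna_qc(220, 100))
    print(passes_rna_qc(150, 100))
    print(passes_rna_qc(220, 100, low_mw_rna_present=True))
```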

Panel 3: 

Panel 3 in Tables 39, 40, and 41 includes samples on a 96 well plate (2 control wells,
94 test samples) composed of RNA or cDNA isolated from various human cell lines or tissues
related to inflammatory conditions. Total RNA from control normal tissues such as colon and
lung (Stratagene, La Jolla, CA) and thymus and kidney (Clontech) were employed. Total
RNA from liver tissue from cirrhosis patients and kidney from lupus patients was obtained
from BioChain (Biochain Institute, Inc., Hayward, CA). Intestinal tissue for RNA preparation
from patients diagnosed as having Crohn's disease and ulcerative colitis was obtained from the
National Disease Research Interchange (NDRI) (Philadelphia, PA).

Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle cells,
small airway epithelium, bronchial epithelium, microvascular dermal endothelial cells,
microvascular lung endothelial cells, human pulmonary aortic endothelial cells, and human
umbilical vein endothelial cells were all purchased from Clonetics (Walkersville, MD) and
grown in the media supplied for these cell types by Clonetics. These primary cell types were
activated with various cytokines or combinations of cytokines for 6 and/or 12-14 hours, as
indicated. The following cytokines were used: IL-1 beta at approximately 1-5 ng/ml, TNF
alpha at approximately 5-10 ng/ml, IFN gamma at approximately 20-50 ng/ml, IL-4 at
approximately 5-10 ng/ml, IL-9 at approximately 5-10 ng/ml, IL-13 at approximately 5-10
ng/ml. Endothelial cells were sometimes starved for various times by culture in the basal
media from Clonetics with 0.1%

Mononuclear cells were prepared from blood of employees at CuraGen Corporation,
using Ficoll. LAK cells were prepared from these cells by culture in DMEM 5% FCS
(Hyclone), 100 µM non essential amino acids (Gibco/Life Technologies, Rockville, MD), 1
mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes
(Gibco) and Interleukin 2 for 4-6 days. Cells were then either activated with 10-20 ng/ml
PMA and 1-2 µg/ml ionomycin, IL-12 at 5-10 ng/ml, IFN gamma at 20-50 ng/ml and IL-18 at
5-10 ng/ml for 6 hours. In some cases, mononuclear cells were cultured for 4-5 days in
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco) with
PHA (phytohemagglutinin) or PWM (pokeweed mitogen) at approximately 5 µg/ml. Samples
were taken at 24, 48 and 72 hours for RNA preparation. MLR (mixed lymphocyte reaction)
samples were obtained by taking blood from two donors, isolating the mononuclear cells using
Ficoll and mixing the isolated mononuclear cells 1:1 at a final concentration of approximately
2x10** cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1
mM sodium pyruvate (Gibco), mercaptoethanol (5.5 x 10^-5 M) (Gibco), and 10 mM Hepes
(Gibco). The MLR was cultured and samples taken at various time points ranging from 1-7
days for RNA preparation.

Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, +ve VS
selection columns and a Vario Magnet according to the manufacturer's instructions.
Monocytes were differentiated into dendritic cells by culture in DMEM 5% fetal calf serum
(FCS) (Hyclone, Logan, UT), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco), 50 ng/ml
GMCSF and 5 ng/ml IL-4 for 5-7 days. Macrophages were prepared by culture of monocytes
for 5-7 days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM Hepes (Gibco) and
10% AB Human Serum or MCSF at approximately 50 ng/ml. Monocytes, macrophages and
dendritic cells were stimulated for 6 and 12-14 hours with lipopolysaccharide (LPS) at 100
ng/ml. Dendritic cells were also stimulated with anti-CD40 monoclonal antibody
(Pharmingen) at 10 µg/ml for 6 and 12-14 hours.

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from
mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection columns
and a Vario Magnet according to the manufacturer's instructions. CD45RA and CD45RO CD4
lymphocytes were isolated by depleting mononuclear cells of CD8, CD56, CD14 and CD19
cells using CD8, CD56, CD14 and CD19 Miltenyi beads and +ve selection. Then CD45RO
beads were used to isolate the CD45RO CD4 lymphocytes with the remaining cells being
CD45RA CD4 lymphocytes. CD45RA CD4, CD45RO CD4 and CD8 lymphocytes were
placed in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco)
and plated at 10* cells/ml onto Falcon 6 well tissue culture plates that had been coated
overnight with 0.5 ng/ml anti-CD28 (Pharmingen) and 3 µg/ml anti-CD3 (OKT3, ATCC) in
PBS. After 6 and 24 hours, the cells were harvested for RNA preparation. To prepare
chronically activated CD8 lymphocytes, we activated the isolated CD8 lymphocytes for 4 days
on anti-CD28 and anti-CD3 coated plates and then harvested the cells and expanded them in
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco) and IL-2.
The expanded CD8 cells were then activated again with plate bound anti-CD3 and anti-CD28
for 4 days and expanded as before. RNA was isolated 6 and 24 hours after the second
activation and after 4 days of the second expansion culture. The isolated NK cells were
cultured in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco)
and IL-2 for 4-6 days before RNA was prepared.

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with sterile
dissecting scissors and then passed through a sieve. Tonsil cells were then spun down and
resuspended at 10« cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM
Hepes (Gibco). To activate the cells, we used PWM at 5 µg/ml or anti-CD40 (Pharmingen) at
approximately 10 µg/ml and IL-4 at 5-10 ng/ml. Cells were harvested for RNA preparation at
24, 48 and 72 hours.

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates
were coated overnight with 10 ng/ml anti-CD28 (Pharmingen) and 2 µg/ml OKT3 (ATCC),
and then washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic Systems,
German Town, MD) were cultured at 10 -10 cells/ml in DMEM 5% FCS (Hyclone), 100 µM
non essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x
10^-5 M (Gibco), 10 mM Hepes (Gibco) and IL-2 (4 ng/ml). IL-12 (5 ng/ml) and anti-IL4 (1
µg/ml) were used to direct to Th1, while IL-4 (5 ng/ml) and anti-IFN gamma (1 µg/ml) were
used to direct to Th2 and IL-10 at 5 ng/ml was used to direct to Tr1. After 4-5 days, the
activated Th1, Th2 and Tr1 lymphocytes were washed once in DMEM and expanded for 4-7
days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM Hepes (Gibco) and IL-2 (1
ng/ml). Following this, the activated Th1, Th2 and Tr1 lymphocytes were re-stimulated for 5
days with anti-CD28/OKT3 and cytokines as described above, but with the addition of
anti-CD95L (1 µg/ml) to prevent apoptosis. After 4-5 days, the Th1, Th2 and Tr1 lymphocytes
were washed and then expanded again with IL-2 for 4-7 days. Activated Th1 and Th2
lymphocytes were maintained in this way for a maximum of three cycles. RNA was prepared
from primary and secondary Th1, Th2 and Tr1 after 6 and 24 hours following the second and
third activations with plate bound anti-CD3 and anti-CD28 mAbs and 4 days into the second
and third expansion cultures in Interleukin 2.

The following leukocyte cell lines were obtained from the ATCC: Ramos, EOL-1,
KU-812. EOL cells were further differentiated by culture in 0.1 mM dbcAMP at 5 x 10^
cells/ml for 8 days, changing the media every 3 days and adjusting the cell concentration to 5
x 10^ cells/ml. For the culture of these cells, we used DMEM or RPMI (as recommended by
the ATCC), with the addition of 5% FCS (Hyclone), 100 µM non essential amino acids
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM
Hepes (Gibco). RNA was either prepared from resting cells or cells activated with PMA at 10
ng/ml and ionomycin at 1 µg/ml for 6 and 14 hours. Keratinocyte line CCD1106 and an airway
epithelial tumor line NCI-H292 were also obtained from the ATCC. Both were cultured in
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco).
CCD1106 cells were activated for 6 and 14 hours with approximately 5 ng/ml TNF alpha and
1 ng/ml IL-1 beta, while NCI-H292 cells were activated for 6 and 14 hours with the following
cytokines: 5 ng/ml IL-4, 5 ng/ml IL-9, 5 ng/ml IL-13 and 25 ng/ml IFN gamma.

For these cell lines and blood cells, RNA was prepared by lysing approximately 10'
cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of bromochloropropane (Molecular
Research Corporation) was added to the RNA sample, vortexed and after 10 minutes at room
temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor. The aqueous phase
was removed and placed in a 15 ml Falcon Tube. An equal volume of isopropanol was added
and left at -20 degrees C overnight. The precipitated RNA was spun down at 9,000 rpm for
15 min in a Sorvall SS34 rotor and washed in 70% ethanol. The pellet was redissolved in 300
µl of RNAse-free water and 35 µl buffer (Promega), 5 µl DTT, 7 µl RNAsin and 8 µl DNAse
were added. The tube was incubated at 37 degrees C for 30 minutes to remove contaminating
genomic DNA, extracted once with phenol chloroform and re-precipitated with 1/10 volume
of 3 M sodium acetate and 2 volumes of 100% ethanol. The RNA was spun down and placed
in RNAse-free water. RNA was stored at -80 degrees C.

Methods: 

The quantitative expression of various clones was assessed using microtiter plates
containing RNA samples from a variety of normal and pathology-derived cells, cell lines and
tissues using real time quantitative PCR (RTQ PCR; TAQMAN®). RTQ PCR was performed
on a Perkin-Elmer Biosystems ABI PRISM® 7700 Sequence Detection System. Various
collections of samples are assembled on the plates, and referred to as Panel 1 (containing cells
and cell lines from normal and cancer sources), Panel 2 (containing samples derived from
tissues, in particular from surgical samples, from normal and cancer sources), Panel 3
(containing samples derived from a wide variety of cancer sources) and Panel 3 (containing
cells and cell lines from normal cells and cells related to inflammatory conditions).

First, the RNA samples were normalized to constitutively expressed genes such as
β-actin and GAPDH. RNA (~50 ng total or ~1 ng polyA+) was converted to cDNA using the
TAQMAN® Reverse Transcription Reagents Kit (PE Biosystems, Foster City, CA; Catalog
No. N808-0234) and random hexamers according to the manufacturer's protocol. Reactions
were performed in 20 μl and incubated for 30 min. at 48°C. cDNA (5 μl) was then transferred
to a separate plate for the TAQMAN® reaction using β-actin and GAPDH TAQMAN®
Assay Reagents (PE Biosystems; Catalog Nos. 4310881E and 4310884E, respectively) and
TAQMAN® universal PCR Master Mix (PE Biosystems; Catalog No. 4304447) according to
the manufacturer's protocol. Reactions were performed in 25 μl using the following
parameters: 2 min. at 50°C; 10 min. at 95°C; 15 sec. at 95°C/1 min. at 60°C (40 cycles).
Results were recorded as CT values (cycle at which a given sample crosses a threshold level of
fluorescence) using a log scale, with the difference in RNA concentration between a given
sample and the sample with the lowest CT value being represented as 2 to the power of delta
CT. The percent relative expression is then obtained by taking the reciprocal of this RNA
difference and multiplying by 100. The average CT values obtained for β-actin and GAPDH
were used to normalize RNA samples. The RNA sample generating the highest CT value
required no further diluting, while all other samples were diluted relative to this sample
according to their β-actin/GAPDH average CT values.
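
The delta CT arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and sample names are hypothetical, and the dilution helper assumes that "diluted relative to this sample" means a 2-fold dilution per cycle of CT difference, which the text does not state explicitly.

```python
def percent_relative_expression(ct_by_sample):
    """Return {sample: percent relative expression} from {sample: CT value}."""
    # The sample with the lowest CT is the highest expresser (defined as 100%).
    lowest_ct = min(ct_by_sample.values())
    percents = {}
    for sample, ct in ct_by_sample.items():
        delta_ct = ct - lowest_ct
        rna_difference = 2.0 ** delta_ct            # "2 to the power of delta CT"
        percents[sample] = 100.0 / rna_difference   # reciprocal, multiplied by 100
    return percents


def dilution_factors(avg_housekeeping_ct):
    """Return {sample: fold dilution} from averaged beta-actin/GAPDH CT values.

    Assumption (not stated in the text): one CT unit corresponds to a 2-fold
    difference in input RNA, so the highest-CT sample gets a factor of 1.0.
    """
    highest_ct = max(avg_housekeeping_ct.values())
    return {sample: 2.0 ** (highest_ct - ct)
            for sample, ct in avg_housekeeping_ct.items()}


if __name__ == "__main__":
    # Hypothetical CT values, for illustration only.
    example_ct = {"sample_A": 22.0, "sample_B": 23.0, "sample_C": 25.0}
    print(percent_relative_expression(example_ct))
```

With this reading, a sample that crosses the threshold one cycle later than the highest expresser is reported at half its relative expression.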

Normalized RNA (5 μl) was converted to cDNA and analyzed via TAQMAN® using
One Step RT-PCR Master Mix Reagents (PE Biosystems; Catalog No. 4309169) and
gene-specific primers according to the manufacturer's instructions. Probes and primers were
designed for each assay according to Perkin Elmer Biosystem's Primer Express Software
package (version I for Apple Computer's Macintosh Power PC) or a similar algorithm using
the target sequence as input. Default settings were used for reaction conditions and the
following parameters were set before selecting primers: primer concentration = 250 nM,
primer melting temperature (Tm) range = 58°-60° C, primer optimal Tm = 59° C, maximum
primer difference = 2° C, probe does not have 5' G, probe Tm must be 10° C greater than
primer Tm, amplicon size 75 bp to 100 bp. The probes and primers selected (see below) were
synthesized by Synthegen (Houston, TX, USA). Probes were double purified by HPLC to
remove uncoupled dye and evaluated by mass spectroscopy to verify coupling of reporter and
quencher dyes to the 5' and 3' ends of the probe, respectively. Their final concentrations
were: forward and reverse primers, 900 nM each, and probe, 200 nM.
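
The selection parameters listed above can be expressed as a simple checklist. The sketch below is not Primer Express and does not compute melting temperatures; it only tests precomputed values against the stated limits. The function name and the input values are hypothetical, and the probe rule is read strictly as "at least 10° C above the warmer primer", which is an assumption.

```python
def check_primer_set(fwd_tm, rev_tm, probe_tm, probe_seq, amplicon_bp):
    """Return a list of violated design rules (an empty list means all rules pass)."""
    problems = []
    # Primer Tm range = 58-60 C.
    for name, tm in (("forward", fwd_tm), ("reverse", rev_tm)):
        if not 58.0 <= tm <= 60.0:
            problems.append(f"{name} primer Tm {tm} outside 58-60 C")
    # Maximum primer Tm difference = 2 C.
    if abs(fwd_tm - rev_tm) > 2.0:
        problems.append("primer Tm difference exceeds 2 C")
    # Probe does not have a 5' G.
    if probe_seq.upper().startswith("G"):
        problems.append("probe has a 5' G")
    # Probe Tm must be 10 C greater than primer Tm (strict reading, assumed).
    if probe_tm - max(fwd_tm, rev_tm) < 10.0:
        problems.append("probe Tm is less than 10 C above primer Tm")
    # Amplicon size 75 bp to 100 bp.
    if not 75 <= amplicon_bp <= 100:
        problems.append(f"amplicon size {amplicon_bp} bp outside 75-100 bp")
    return problems


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(check_primer_set(59.0, 59.5, 69.8, "CAGTACGT", 80))
```

A real design tool applies such limits while searching candidate oligomers; here they are only verified after the fact.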

The Taqman oligonucleotide set Ag756 for NOV-1, NOV-2, and NOV-2b (i.e.,
10132038) includes the forward, probe and reverse oligomers shown below:



TABLE 36

Primers | Sequences | TM | Length | Start Position
Forward | S'-GGAGCAGrrCCTCACITATCXS-S' (SEQ ID NO: 47) | 59 | 21 | 248
Probe | T(MTaACCAGACO-CAAGAAACACrcG-3'-TAMRA (SEQ ID NO: 48) | 68.6 | 27 | 272
Reverse | S'-CAGTTGCCAl'Cl-l-miX.TrCAT-y (SEQ ID NO: 49) | 59.2 | 22 | 304



The Taqman oligonucleotide set Ag756 for NOV-3a through NOV-3d (i.e., 18552586)
includes the forward, probe and reverse oligomers shown below:
TABLE 37

Primers | Sequences | TM | Length | Start Position
Forward | 5'-AATGCTGAGGTCAAGCTAGGT-3' (SEQ ID NO: 50) | 58.1 | 21 | 121
Probe | TET-5'-<JrCClTCTGAGGCTGACGAGGACCT-3'-TAMRA (SEQ ID NO: 51) | 69.3 | 25 | 149
Reverse | 5'-CATTCTCrGTrcTGGAGGTGAA-3' (SEQ ID NO: 52) | 59.3 | 22 | 174



The Taqman oligonucleotide set Ag756 for NOV-4a, NOV-4b, NOV-4c, NOV-4d, and
NOV-4e (i.e., 10093872) includes the forward, probe and reverse oligomers shown below:
TABLE 38

Primer | Sequences | Length
Forward | 5'-GGACTCCTCGGGATGGAAAG-3' (SEQ ID NO: 53) | 20
Probe | FAM-5'-CGGCCTTGGTCTCGGAGATCCC-3'-TAMRA (SEQ ID NO: 54) | 23
Reverse | 5'-CTCCCCTGGTGCTGGAAA'lT-3' (SEQ ID NO: 55) | 20



PCR conditions:

Normalized RNA from each tissue and each cell line was spotted in each well of a 96
well PCR plate (Perkin Elmer Biosystems). PCR cocktails including two probes (a probe
specific for the target clone and another gene-specific probe multiplexed with the target probe)
were set up using 1X TaqMan™ PCR Master Mix for the PE Biosystems 7700, with 5 mM
MgCl2, dNTPs (dA, G, C, U at 1:1:1:2 ratios), 0.25 U/ml AmpliTaq Gold™ (PE Biosystems),
and 0.4 U/μl RNase inhibitor, and 0.25 U/μl reverse transcriptase. Reverse transcription was
performed at 48° C for 30 minutes followed by amplification/PCR cycles as follows: 95° C 10
min, then 40 cycles of 95° C for 15 seconds, 60° C for 1 minute.
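
As a quick check on the one-step program just described, the thermal profile can be written down as data and its total programmed hold time summed. The step labels below are descriptive names chosen for this sketch, not terms from the text, and instrument ramp time between temperatures is ignored.

```python
# (label, temperature in degrees C, hold time in seconds, number of repeats)
ONE_STEP_PROGRAM = [
    ("reverse transcription", 48, 30 * 60, 1),
    ("initial 95 C hold", 95, 10 * 60, 1),
    ("denaturation", 95, 15, 40),
    ("anneal/extend", 60, 60, 40),
]


def total_hold_seconds(program):
    """Sum the programmed hold times, ignoring ramping between steps."""
    return sum(seconds * repeats for _, _, seconds, repeats in program)


if __name__ == "__main__":
    minutes = total_hold_seconds(ONE_STEP_PROGRAM) / 60
    print(f"Programmed hold time: {minutes:.0f} min")
```

The 40 cycles account for 50 of the 90 programmed minutes; an actual run takes longer because of ramping.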



TABLE 39: NOV-1, NOV-2, NOV-2b Taqman Results

In panel 1 of the results, the following abbreviations are used:

ca. = carcinoma,
* = established from metastasis,
met = metastasis,
s cell var = small cell variant,
non-s = non-sm = non-small,
squam = squamous,
pl. eff = pl effusion = pleural effusion,
glio = glioma,
astro = astrocytoma, and
neuro = neuroblastoma.

In panel 2 of the results, the following abbreviations are used:

CCa: Colon Cancer
PCa: Prostate Cancer
LCa: Lung Cancer
RCC: Renal Cell Carcinoma
UtCa: Uterine Cancer
ThyCa: Thyroid Cancer
BrCa: Breast Cancer
HCC: Hepatic Cell Carcinoma
TCC: Transitional Cell Carcinoma of the bladder
OvCa: Ovarian Cancer
GaCa: Gastric Cancer





Panel 1: Tissue Name | ag756 % Rel. Expn. (Run 1) | ag756 % Rel. Expn. (Run 2)
Panel 2: Tissue_Name | ag756 % Rel. Expn.
Panel 3: Tissue_Name | ag756 % Rel. Expn.


Endothelial 
cells 


0.0 


0.0 


Nomnai Colon 


78.5 


93768_Secondary 

Thi anti-CD28/anti-CD3 


0 


Endothelial 
cells (treated) 




54.7 


CCa1 


1.0 


93769_Secondary 
Th2 antl-CD28/anti-CD3 


0 


Pancreas 


27.6 


5.4 


CCa1 
Margin 


7.9 


93770_Seoondary 
Tr1_anti-CD28/anti-CD3 


0 


Pancreatic 
ca.CAPAN 2 


0.0 


0.0 


CCa2 


3.7 


93573_Secondary 
Th1_resting day 4-6 In IL-2 


0 


Adrenal Gland 
(new tot*) 


9.3 


29.3 


CGa2 
Margin 


15,2 


93572_Secondary 
Th2_restlng day 4-6 In IL-2 


0 


Thyroid 


8.0 


6.5 


CCa3 


0.4 


93571„Secondary 
Tr1_restlng day 4-6 in IL-2 


0 
Salivary gland



Pituitary gland 



6.8 



3^ 



19.9 



CCa3 
Margin 



93568_priniary Thijann- 
35.6 I CP28/anti-CD3 I o 



7.8 



CCa4 
CCa4 



10.1 



93569_prinnafy Th2_antj- 
CD28/anti-CD3 



Brain (fetal) 



3.4 



18.4 I Margin 



11.6 



93570 j)rinriary Tr1_anti- 
CD28/antl-CD3 



Brain (whole) 



6.9 



27.4 



CCa5 
Metastasis 



7.2 



93565 jrinnary Th1_resting 
dy4-6in IL-2 | o 



Brain 

(amygdala) 
Brain 



2.5 



13.8 



CCa5 
Margin 
(Liver) 



52.9 



93666j3rfmary Th2_resting 
dy4-6ln iL-2 



(cerebellum) 



2.0 



28.7 CCa6 



2.5 



93567_j)rimary Tr1_resting 
dy 4-6 In IL-2 j o 



Brain 

(hippocampus) 



3.8 



20.9 



CCa6 r 93351_CD45RACD4 

l^argin lymphocyte_anti- 
(Lung) I 14.1 CD28/anti-CD3 



Brain 

(thalamus) 



3.0 



11.0 



Normal 
Prostate 



93352_CD45RO CD4 
lymphocyte_antl" 
10.0 I CD28/antl-CD3 



Cerebral 
Cortex 



7.0 



61.1 PCal 



93251_CD8 
Lymphocytes^antl- 
10.7 I CD28/anti-CD3 



Spinal cord 



8.6 



27.0 PCa 1 Margin 



93353_chronic CD8 
Lymphocytes 2ry_resting 
37.6 I dy 4-6 in IL-2 



CNS 

ca.(glio/astro) 
U87-MG 



CNS 

ca.(glio/astro)U 
-118-MG 



0.0 



0.0 



PCa 2 



93574_chronlc CD8 
100. I Lymphocytes 2ry_activated 
0 I CD3/CD28 I 0 



0.2 



CNS 

ca.(astro)SW1 
783 



0.0 PCa 2 Margin 89.5 | 93354 CD4 none 



0.0 



93252_Secondary 

^ ^ , , , Th1/Th2n'r1_antl-CD95 
0 0 I NonnalLung 51.1 CH11 | q 



CNS ca.* 
(neuro; met 
)SfCN-AS 



CNS ca. 
(astro)SF-539 



CNS ca. 
(a$tro)SNB-75 



0.0 



0.0 



LCal 
Metastasis 



0.1 



0.0 



LCa 1 Margin 
(Muscle) 



10 93103_LAK cells resting | 0 
11.3 I 93788 LAK cellsJL-2 I 0 



0.3 



0.3 



, 93787_LAK cells IL-2+IL- 
LCa2 I 8.5 12 



CNS I 
(gIio)SNB-19 



0.1 



CNS ca. 
(glio)U251 



0.0 LCa2Manain 



93789_LAK cellsJL-2+IFN 
31.6 I gamm a | o 



0.0 



0.0 



, 93790_LAK cells IL-2+IL- 
LCa3 I 5.8 | 18 ~ iq 



CNS ca. 
(glio)SF-295 



0.0 



0-0 LCa 3 Margin 



931 04_LAK 

cells_PMA/ionomycin and 
28.3 I IL-18 



Heart 



Skeletal 
Muscle (new 
lot*) 



28.5 



77.9 LCa4 



93678_NK Cells IL- 
1.6 I 2_resting 



Bone manrow 



Thymus 



16.3 



15.7 LCa 5 



! 93109_Mixed Lymphocyte 
I 4.2 I Reaction_Two Way MLR | 0 



0.7 



0.9 LCa 5 Margin 



1.1 



2.7 



Ocular 
Melanoma 
Metastasis 



93110_Mixed Lymphocyte 
29.5 I Reaction Two Way MLR | 0 



, 93111_Mixed Lymphocyte 
15.9 Reaction_^Two Way MLR | 0 



Spleen 



0.9 



2.1 



Ocular 

Melanoma 

Margin 



170 



931 12_Mononuclear Cells 
38.7 I (PBMCs)_resting | o 
Lymph node | 3.6



931 13_MononucIear Cells 
0.0 I (PBMCsLPWM 



931 14_Mononuclear Cells 
32,3 I (PBMCs ) PHA-L 



93249_Ramos (B 
56.3 I cel!)_none 



Small intesti ne | 11.7 
Colon 



71.2 



93250_Ramos (B 
cell)Jonomycin 



93349_B 
26.1 I lymphocytes PWM 



63.7 



93350_B 

I l>^phoytes_CD40L and IL- 
4 



Colon ca.HT29 0.0 



28.3 



92665_EOL-1 

(EosinophilLdbcAMP 

differentiated 



93248_EOL-1 
(EosinophilLdbcAMP/PmA 



37.1 iononriydn 



93356_Dendritic 
33.7 I Cells none 



83219 CC Well 



93355_Dendritic Cells_LPS 
5.7 I lOQng/ml 



Colon ca.HCC- 



93775_Dendrltic 
'18.2 I Cells__anthCD40 



8.5 93774„Monocytes resting 



93776„Monocytes_LPS 50 
5.5 I ng/ml 



93581_Macrophages_resti 
1.0 I ng 



93582_Macrophages LPS 
18.7 I 100 ng/ml 



93098_HUVEC 
6.0 I (Endothelial) none 



93099_HUVEC 
S.5 I (Endothelial) starved 



Renal ca JV498 0.2 



93100_HUVEC 
0.3 I (Endothelial) IL-1b 



93779_HUVEC 
14.1 (EndotheliaPJFN gamma 



6.3 



93102_HUVEC 
(Endothelial)_TNF alpha + 
IFN gamma 



0.1 



93101_HUVEC 
(Endothellal)_TNF alpha + 
19.6 I IL4 



0.0 



93781_HUVEC 
7.0 I (Endothelial) JL-11 



0.0 



93583_Lung Microvascular 
46.0 I Endothelial Cells none 



0.0 



93584_Lung Microvascular 
Endothelial Cells TNFa (4 
6.1 I ng/ml) and IL1 b (i ng/ml) 



37.1 



92662_Microvascular 
6.1 I Dermal endothelium none 



12.9 



81.2 
HepG2














Lung 


7.4 


25.4 


ThyCa2 


0.8 


92663_MicrDsvasular 

L/61 1 1 idl CI luUU iciium 1 i^ira 

(4ng/ml)and IL1b(1 
no/mh 


OO.f 


Lung (fetal) 


9.6 


19.0 


ThyCa2 
Margin 


22.5 


93773_Bronchial 
ep(theh'um_TNFa (4 ng/ml) 
and IL1 b M na/mH ** 


u.u 


Lung ca. (small 
cell) LX-1 


0.0 


0.0 


Nonmal 
Breast 


12.1 


93347„Small Ainway 
EnithpHum nnriA 


n n 
u.u 


Lung ca. (small 
cell) NCI-H69 


0.0 


0.0 


BrCa 1 


7.5 


93348_SmaII Airway 
Epithelium_TNFa (4 ng/ml) 

And II 1 h (i nn/ml\ 


U.o 


Lung ca. (s.cell 
var.) SHP-77 


0.0 


0.0 


BrCa2 


4 0 


92668_Coronefy Artery 

viviw 1 C9UI ly 


U.D 


Lung ca. (large 
cell)NCI-H460 


0.1 


0.0 


BrCa 3 
Metastasis 


15.1 


92669_Coronery Artery 
oiviv>___ 1 iNrci yt ng/ml ) ano 
ILIbd ng/ml) 


0.1 


Lung ca. (non- 
sm. cell)A549 


0.2 


0.0 


BrCa 4 
Metastasis 


18.4 


93107 astrocytes resting 


20.0 


Lung ca. (non- 
s.cell) NCI-H23 


0.4 


2.4 


BrCa 5 


11.7 


iuo_ooiruuyio5^ i iMra 

ng/ml) and IL1b(1 ng/ml) 


17.0 


Lung ca (non- 
s.cell) HOP-62 


1.3 


0.8 


BrCa 6 


^ 1 


92666_KU-«12 


0.0 


Lung ca. (non- 
s.cl) NCI-H522 


100.0 


100.0 


BrCa 6 
Margin 


5.3 


92667_KU-812 
(Basophil) PMA/ionoycin 


0.0 


Lung ca. 
(squam.) SW 
900 


0.7 


0.8 


BrCa 7 


68 


93579_CCD1106 
\f\t7] dill luuy uto / none 


0.0 


Lung ca. 
(squam.) NCI- 
H596 


0.0 


0.0 


BrCa 7 
Marain 

Mil ■ 


11 7 


93580_CCD1106 
(Kerab'nocytes) TNFaand 


4.1 


Mammary 
gland 


5.0 


7.5 


Nonnal Liver 


37.1 


93791 Liver Cinrtiosls 


8.1 


Breast ca.* (pi. 

effusion) 

MCF-7 


2.7 


12.0 


HCC 1 


47.0 


^yjitj^ L.upuo rxiuney 


lo.l 


Breast ca.* 
(pI.eO MDA- 
MB-231 


0.0 


0.0 


HCC2 


34.2 


93577 NCI-H292 


1.9 


Breast ca.* (pi. 
effusion) T47D 


0.2 


0.0 


HCC 3 


52 


93358 NCI-H292 IL-4 


4.6 


Breast ca.BT- 
549 


0.0 


0.0 


HCC 4 


27.6 


93360 NCI-H292 IL-9 


0.9 


Breast ca. 
MDA-N 


0.0 


0.0 


HCC 4 
Margin 


3.6 


93359_NCI-H292 IL-13 


1.7 


Ovary 


5.5 


18.4 


HCC 5 


5.3 


gamma 


4.6 


Ovarian 
ca.OVCAR-3 


11.8 


21.2 


HCC 5 
Margin 


15.9 


93777 HPAEC - 


0.0 


Ovarian 
ca.OVCAR-4 


4.6 


12.5 


Normal 
Bladder 


27.0 


93778_HPAEC_IL-1 


U.U 


Ovarian 
ca.OVCAR-5 


0.2 


0.0 


TCC1 


2.1 


93254_Noniial Human 

Luntl Fnhrnhlf)<it nnno 


u.o 


Ovarian 
ca.OVCAR-6 


4.5 


21.5 


TCC2 


1.1 


93253_Normal Human 
Lung Fibroblast TNFa (4 
ng/ml) and IL-1b(1 ng/ml) 


0.8 


Ovarian 
ca.lGROV-1 


4.3 


5.4 


TCC3 


2.1 


93257_Nonmal Human 
Lung Fibroblast IL-4 


0.5 


Ovarian ca.* 


50.0 


92.7 


TCC3 


52.1 


93256_NonDaI Human 


0.3 
(ascites) SK-OV-3






Margin 




Lung FibrodlastJL-9 




Uterus 


7.6 


24.2 


Normal 
Ovary 


7.7 


93255_Normal Human 
Lung Fibroblast IL-13 


1 7 

1 .f 


Plancenta 


17.0 


31.4 


OvCa1 


B9 5 


93258_Nonmal Human 
Lung FibroblastJFN 
gamma 


in 0 


Prostate 


5.3 


15.5 


OvCa2 


45.1 


93106„DemiaI Fibroblasts 
CCD1070 resting 


0.0 


Prostate ca* 
{bone met)PC- 
3 


14.7 


42.6 


OvCa2 
Margin 


8.7 


93361 Demnal Fibrobla<%t<s 
CCD1070_TNF alpha 4 
ng/ml 


u.o 


Testis 


10.2 


13.1 


Normal 
Stomach 


25.7 


93105 Dennal Fibroblasts 
CCD1070JL-1 beta 1 
ng/mi 


u.u 


ivielanoma 
Hs688(A).T 


0.0 


0.0 


Nonmal 
Stomach 


15.6 


93772_dermal 
fibroblast IFN aamma 


u.o 


l\4elanoma* 
(met) 

Hs688(B).T 


0.1 


0.0 


GaCal 


26.6 


93771 dermal 
fibroblast^lL^ 


0.5 


l\4elanomaUAC 
C-62 


0.2 


0.0 


GaCa1 
Margin 


31.0 


93259JBD Colitis 1** 


19.3 


l\4e!anoma 
1^14 


0.0 


0.0 


6aCa2 


15.4 


93260 IBD Colitis 2 


6.1 


Melanoma 
LOX IMVI 


0.0 


0.0 


oaoa 
Mai^in 


5.2 


93261JBD Crohns 


3.7 


Melanoma* 

(met)SK-MEL- 

5 

Adipose 


0.1 
2.5 


29.9 


oaL^a o 


13.7 


735010 Colon nomial 


26.1 












73501 9_Lung none 

64028-1 Thymus none 
64030-1_Kidnev none 


90.1 
100. 
0 

16.0 



TABLE 40: NOV-3a, NOV-3b, NOV-3c Taqman Results

Panel 1 Tissue_Name | Ag664 % Rel. Expn. | Panel 2 Tissue_Name | ag664 % Rel. Expn. | Panel 3 Tissue_Name | ag664 % Rel. Expn.
Liver adenocarcinoma | 13.6 | Normal Colon | 70.2 | 93768_Secondary Th1 anti-CD28/anti-CD3 | 16.4
Heart (fetal) | 6.5 | CCa 1 | 22.7 | 93769_Secondary Th2 anti-CD28/anti-CD3 | 12.9
Pancreas | 6.4 | CCa 1 Margin | 9.0 | 93770_Secondary Tr1 anti-CD28/anti-CD3 | 18.3
Pancreatic ca. CAPAN 2 | 1.6 | CCa 2 | 14.0 | 93573_Secondary Th1 resting day 4-6 in IL-2 | 22.1
Adrenal gland | 10.5 | CCa 2 Margin | 6.5 | 93572_Secondary Th2 resting day 4-6 in IL-2 | 13.1



TTiyroid 

Salivary gland : 


5.6 

4.8 1 


CCa 3 

E:;c;a 3 Margin : 


CO CM 

sis i 


93571_SecondaryTr1 resting day 
4-6lnlL-2 

93568_j)rimaryTli1 anti- 


23.0 
11.5 
CD28/anti-CD3



Pituitary gland 



14.3 CCa4 



93569j)rimary Th2_anti- 
27.6 |CD28/anti-CD3 



15.9 



Brain (fetal) 



27.6 IcCa 4 Margin 



Brain (whole) 



CCa5 
22.5 |l\^etastasis 



93570_primary Tr1_anB-CD28/anti- 
10.2 |CD3 



Brain (amygdala) 



22.7 



CCa 5 Margin 
(Liver) 



93565_primaryTh1 resting dy 4-6 
38.4 |inlL-2 



16.6 



93566jDrimaryTh2 resting dy 4-6 
7.3 inlL-2 



73.7 



47.0 



Brain (cerebellum) 



13.0 CCa 6 



93567_primary Tri resting dy 4-6 
34.4 lnlL-2 



26.4 



Brain (hippocampus) 



100.0 



CCa 6 Margin 
(Lung) 



93351_CD45RACD4 
5-9 iymphocyte_anti-CD28/anti-CD3 



Brain (thalamus) 



22.4 Normal Prostate 



93352_CD45RO CD4 
20.7 I lymphocyte jnti-CD28/anti-CD3 



8.5 



Cerebral Cortex 



24.3 PCal 



y3251_CD8 Lymphocytes anti- 
26.6 |CD28/anti-CD3 



19.3 



Spinal cord 



22.7 |PCa1 Margin 



32.8 



[93353_chronjc CDS Lymphocytes 
2ry_resting dy 4-6 in IL-2 



8.0 



9.9 



gllQ/astro U87-MG 
glio/astro U-118-MG 



2.8 PCa 2 



22.7 IPCa 2 Margin 



93574_chronlc CDS Lymphocytes 
47.3 2ry_activated CD3/CD28 



36.9 193354 CD4 none 



6.7 



17.4 



astro SW1 783 



neuro; metSK-N-AS 



astro SF-539 



5.4 Nonmal Lung 



26.8 jLCa 1 Metastasis 



93252_Secondary 
100.0 |ThiyTh2/rr1 anti-CD95 CH11 



12.8 



LCa 1 Margin 
l(Muscle) 
LCa 2 



12.5 [93103 LAK cells resting" 



20.7 



20.5 



8 193788 LAK cells IL-2 



19.3 



glioSNB-19 



giio U251 



5.A_ 
.4 



[LCa 2 Margin 



24.2 193/87 LAK cells IL-2+IL-12 



kO LCa 3 



40.9 aa/gg lak cells. IL-2-HFN gamma 



6.8 



13.6 193790 LAK cells IL-2TlL::Tft 



16.0 



24.2 



glio SF-295 



4.5 



Heart 



2.4 



|LCa 3 Margin 
|LCa4 



y3l04_LAK celis^PMA/ionomycIn 
7.8 and IL-18 



10.4 [93578 NK Cells IL-2 "rating" 



Skeletal muscle 



0.9 LCa 5 



»3i09_Mlxed Lymphocyte 
32.3 |Reaction_Tw o Way M LR 



1.5 



18.7 



23.7 



Bone marrow 



17.0 LCa 5 Margin 



Thymus 



20.3 



Ocular 
Melanoma 
Metastasis 



a3iio_Mixed Lymphocyte 
12.1 Reaction Two Way MLR 



5.8 



6.8 



Spleen 



25.4 



Ocular 
Melanoma 
Margin (Liver) 



931 1 1_Mixed Lymphocyte 
Reaction Two Way MLR 



931 12_Mononuclear Cells 
8.0 |(PBMCs)„re$ting 



10.2 



8.6 



Lymph node 



29.9 



Melanoma 
Metastasis 



Colorectal 



Stomach 



5.9 



24.8 



Melanoma 
Margin (Lung) 
Normal Kidney 



18.2 
16.4 



931 1 3_MononucIear Ceils 
|(PBMCs)JWM 



931 14_Mononuclear Cells 
|(PBMCs)_PHA-L 



24.5 



40.9 93249_Ramos (B celi)_none 



18.6 



5.2 



14.4 IRCCI 



32.8 SJ3250^Ramos (B celQJonomycin 



17.8 



Cdon SW620(SW480 

^ 



4.9 IRCC 1 Margin 



RCC2 



30.6 
63.3 



93349„B lymphocytes PWM 



93360_B lymphoytes CD40Land 
IL-4 



262 



30.6 



Colon HT29 



6 RCC 2 Margin 



92665_EOL-1 
9-7 (Eosinophn)_dbcAMP differentiated 



9.7 



Colon HCT-116 



Colon CaCo-2 



3.2 RCC 3 



Colon Ca 
tissue(OD03866) 



7 IRCC 3 Margin 



31.2 



93248_EOL-1 
(Eosinophil)_dbcAMP/PmA[onomyc 



18.6 



93356 Dendritic Cells none 



18.7 RCC 4 



4.5 



93355_Dendrltic Cells^LPS 100 
ng/ml 



22.2 



12.1 



20.0 
Colon HCC-2998



32.6 



GastricOivermet)NCI-| 
N87 lii.o 



RCC 4 Margin |12.2 



RCC5 



11.6 



93775_Dendrftlc Cells ant^CD40 



93774_Monocytes„resting 



17.7 



22.4 



Bladder 



Trachea 



35.6 



RCC 5 Margin |3.9 



RCC 6 



15.8 



93776_Monocytes_l-PS 50 ng/ml 



93581_Macrophages_ resting 



2.0 



Kidney 



6.4 



Kidney (fetal) 



13.7 



RCC 6 Margin |l4.6 



RCC 7 



93582_Macit)phages_l-PS 100 
ng/ml 



11.8 



92 



93098_HUVEC (Endothelial) none 



3.9 



Renal 786-0 



0.0 



Renal A498 



20.7 



RCC 7 Margin |5.8 



RCC 8 



20.9 



93099_HUVEC 
(Endoihelfal) starred 



6.0 



93100_HUVEC (Endothelial) lUlb 



15.0 



5.3 



Renal RXF393 



1.5 



RCC 8 Margin 110.7 



93779„HUVEC (Endothelial)JFN 
gamma 



132 



Renal ACHN 



1.8 



RCC 9 



23.0 



93102„HUVEC (Endothelial)_,TNF 
alpha IFN gamma 



^enal UO-31 



1.6 



Renal TK-10 
Liver 



|2X 



RCC 9 Margin I21.O 



Nomial Uterus ISid' 



93101_HUVEC (Endothelial)„TNF 
alpha + IL4 



9.0 



937ei_HUVEC (Endotheiial) IL-11 



5.3 



10.5 



UtCal 



31.9 



93583_Lung Microvascular 
Endothelial Cells none 



5.5 



Liver (fetal) 
Liver (hepatoblast) 



21.8 



Nomnal Thyroid 13.8 



93584_Lung Microvascular 
Endothelial Ceiis_TNFa (4 ng/ml) 
and ILIbd ng/ml) 



7.3 



6.5 



HepG2 



14.1 



ThyCal 



6.3 



92662_Microvascular Denral 
endothelium none 



8.0 



Lung 



48.6 



ThyCa2 



7.9 



92663_Mlcrosvasular Dennal 
endothelium_TNFa (4 ng/ml) and 
IL1b(1 ng/ml) 



Lung (fetal) 



24.3 



ThyCa 2 Margin 17.0 



93773_Bronchlal epithelium^TNFa 
:4 ng/ml) and ILIb (1 ng/ml) ** 



10.2 



12.2 



[Lung (small cell)LX-1 4.7 



Normal Breast 36.9 



93347_Smali Ainvay 
Epitheliumjone 



Lung (small cell) NCI- 
|H69 



BrCal 



10.7 



93348_SmaII Ainvay 
Epithelium^TNFa (4 ng/ml) and 
IL1b(1 ng/ml) 



32.5 



Lung (s.cell var.) 



SHP-77 



24.5 



BrCa2 



112 



92668_Coronery Artery 
SMC_restlng 



Lung (large cell)NCI- 
|H460 (1.6 



Lung (non-sm. cell) 
A549 



BrCa3 
Metastasis 



32.8 



1.4 



Lung (non-s.cell) NCI- 
H23 I10.7 
Lung (non-s.cell) 

HOP-62 |32.3 

Lung (non-s.cl) NCI- 
|H522 |i.7 



BrCa4 
Metastasis 



92669_Coronery Artery 
SMC_TNFa (4 ng/ml) and ILIb (1 
ng/ml) 



O.O 



13.7 



931 07_astrocytes resting 



BrCaS 



19.8 



93108_astrocytes_TNFa (4 ng/ml) 
and ILIbd ng/ml) 



42 



BrCa6 



29.1 



92666_KU-812 (Basophil) resting 



BrCa 6 Margin 172 



92667_KU-812 
;Basophll)_PMA/ionoycin 



72 



Lung (squam.) SW 

900 

Lung (squam.) NCI- 
H596 



Mammary gland 
(Breast (pI.eQ MCF-7 



3.6 



BrCa7 



13.9 



0.9 



272 



BrCa 7 Margin 125.5 



Nonnal Liver fsis" 



93579_CCD1106 
Keratinocytes)_none 



93580_CCD1106 
(Keratinocytes)_TNFa and IFNg ** 



93791_Liver Cinliosis 



12.4 



iCCi 



14.1 



93792_mpus Kidney 



Breast (pl.ef)MDA- 
MB-231 



18.7 



Breast (pl.ef)T47D 
Breast BT-549 



HCC2 



0.4 



14.5 



HCC3 



93577 NCI-H292 



7.9 



93358_NCI-H292 IL-4 



28.9 



Breast MDA-N 
lovary 



HCC4 



20. 



16.0 



122 



feu 4 Margin 1 19.3 



93360__NCI-H292 IL-9 



93359 NCI-H292 IL-13 



HCC5 



93357_NCI-H292JFN gamma 



.9 
Ovarian OVCAR-3


2.1 


HOC 5 Margin 


2.0 


93777 HPAEC - 


6.5 


Ovarian OVCAR-4 


0.5 


Normal Bladripr 


Ad A 


93778_HPAEC_IL-1 betefTNA 


9.7 


Ovarian OVCAR-5 


0.6 


TCC 1 




93254_Non7iaI Human Lung 
r loruDiasi^none 


2.2 


Ovarian OVCAR-8 


6.5 


TCC2 


16.4 


93253_Nomf)al Human Lung 
rioroDjasi J NrB (4 ng/Ttii) and IL- 
1b (1 ng/ml) 


3.0 


Ovarian IGROV-1 


3.5 


TCC 3 


22.7 


yo^o/_iMormai numan Lung 
Fibroblast IL-4 


4.0 


Ovarian (ascites) SK- 
OV-3 


2.3 


TCC 3 Margin 


13.4 


vo£.oo Noniiai Muman Lung 
fibroblast IL-9 


3.2 


Uterus 


172 


Nonma! Ovary 


12.7 


^ozoo Normal Human Lung 
Fibroblast IL-13 


4.8 


Plancenta 


12.9 


OvCal 


23.3 


mzw_Norfr\3\ Human Lung 
Fibroblast IFN gamma 


4.0 


Prostate 


8.5 


OvCa2 


72.2 


i^oiuo Dermal Fibroblasts 
CCD1070 resting 


1.5 


Prostate (bone 
met)PC-3 


3.3 


OvCa 2 Margin 


4.1 


yoooi_ijermai riDroblasts 
CCD1070_^TNF alpha 4 ng/rril 


42.6 


Testis 


4.1 


Nonmaf Stomach 


20.2 


93105_Dennal Fibroblasts 
CCD1070 IL-1 beta 1 ng/ml 


7.3 


Melanoma 
Hs688(A).T 


0.6 


Nonnal Stomach 


5.2 


93772_dermal fibroblastJFN 
gamma 


4.3 


[Melanoma (met) 
Hs688(B).T 
IWeianoma UACC-62 
Melanoma M14 


0.5 
1.6 
0.6 


GaCa 1 

GaCa 1 Margin 
GaCa2 


8.4 

15.9 

38.4 


93771_dermal fibroblast IL-4 
93259 IBD Colitis 1** 


10.4 
3.2 


Melanoma LOX IMVI 
Melanoma (met) SK- 
MEL-5 ( 
Adipose i 


1.7 

5.2 i 
5.0 


GaCa 2 Margin 
SaCa3 


4.5 

55.5 ■ 

( 
( 


93260 IBD Colitis 2 
^261 IBDCrohns 

735010 Colon normal ; 
735019 LungLnone ( 
34028-1 Thymus none : 
S4030-1 Kidney none 


0.0 

n n 
u.u 

23.0 
S.4 
21.2 
100.0 



TABLE 41: NOV-4a, NOV-4b, NOV-4c, NOV-4d, and NOV-4e Taqman results

Panel 1 Tissue_Name | ag538 % Rel. expn. | Panel 2 Tissue_Name | ag538 % Rel. expn.
Adipose | 12.6 | Normal Colon GENPAK 061003 | 9.7
Adrenal gland | 19.9 | 83219 CC Well to Mod Diff (OD03866) | 4.3
Bladder | 100.0 | 83220 CC NAT (OD03866) | 3.3
Bone marrow | 4.8 | 83221 CC Gr.2 rectosigmoid (OD03868) | 2.9
Endothelial cells | 0.0 | 83222 CC NAT (OD03868) | 2.1
Endothelial cells (treated) | 4.5 | 83235 CC Mod Diff (OD03920) | 8.0
Liver | 9.3 | 83236 CC NAT (OD03920) | 4.6
Liver (fetal) | 4.1 | 83237 CC Gr.2 ascend colon (OD03921) | 3.4
Spleen | 4.4 | 83238 CC NAT (OD03921) | 2.4
Thymus | 2.3 | 83241 CC from Partial Hepatectomy (OD04309) | 2.8
Thyroid



14.0 



Trachea 



7.6 



Testis 



10.4 



Spinal cord 



8.7 



Salavary gland 



13.7 



Brain (amygda)a) 



02 



Brain (cerebellum) 



0.8 



Brain (hippocampus) i2 



Brain (substantia nigra) I 7.9 



Brain (thalamus) 



i2 



Cerebral Cortex 



1.0 



Brain (whole) 



0.4 



Brain (fetal) | o.l 

CNSca. (glio/astro) 
U-118-MG I 13 



CNS( 



> ca. (astro)SF"539 
CNS ca. (astro) SNB- 



0.4 



75 



1.0 



CNS ca. (astro) 
SW1783 



4.7 



CNSca.(giio)U251 I 0.0 



CNSca.(glio)SF-295 I 2.1 
CNSca.(glio)SNB-19 I O.O" 



CNSca. 
(gliQ/astro)U87-MG 



CNS ca.* (neuro; met ) 
SK-N-AS 



0.0 



0.1 



Smalf intestine 



31.4 



Colorectal 



29.7 



Colon ca. HT29 



0.2 



Colon ca.CaCo-2 



0.0 



Colon ca.HCT-.15 



0.4 



Colon ca.HCT-1 16 



0.0 



Colon ca. HCC-2998 0.8 



Colon ca. SW480 | 0.3 
Colon ca.* (SW480 
met)SW620 | o.O 



Fetal Skeletal 



16.5 



83242 Liver NAT (QDO4309T 



87472 Colon mets to lung 
(OD04451-01) 



87473 Lung NAT (OD04461- 
02) 



Nonnal Prostate Clontech A+ 
6546-1 



84140 Prostate Cancer 
(OD04410) 



84141 Prostate NAT 
(OD04410) 



87073 Prostate Cancer 
(ODQ4720-01) 



87074 Prostate NAT 
(ODQ4720-02) 



Nonnal Lung GENPAK 
061010 



4.5 



7.0 



172 



6.2 



13.0 



100.0 



20.2 



6.0 



83239 Lung Met to Muscle 
(OD04286) 



83240 Musde NAT 
(OD04286) 



84136 Lung Malignant 
Cancer (OD03126) 



84137 Lung NAT (OD03126) 



84871 Lung Cancer 
(OD04404) 



84872 Lung NAT (OD04404) 



84875 Lung Cancer 
(OD04565) 



2.7 



0.5 



9.8 



2.0 



3.1 



2.0 



13.2 



85950 Lung Cancer 
(OD04237-01) 



85970 Lung NAT (OD04237- 
02) 



83255 Ocular Mel Met to 
Liver (QPO4310) 



83256 Liver NAT (ODQ4310) 



84139 Melanoma Mets to 
Lung (OD04321) 



84138 Lung NAT (OD04321) 



Nonmal Kidney GENPAK 
061008 



83786 Kidney Ca, Nuclear 
grade2(OD04338) 



83787 Kidney NAT 
(OD04338) 



83788 Kidney Ca Nuclear 
grade M2 (OD04339) 



83789 Kidney NAT 
(OD04339) 



83790 Kidney Ca, Clear cell 
type (QD04340) 



83791 Kidney NAT 
:OD04340) 



83792 Kidney Ca, Nuclear 
grade 3 (OD04348) 



83793 Kidney NAT 
I PD04348) 



87474 Kidney Cancer 
(OD04622-01) 



9.8 



42 



13.3 



0.7 



8.7 



12 



6.0 



7.5 



8.8 



16.5 



3.9 



6.9 



8.0 



8.8 



3.9 



13.3 



5.2 
Skeletal muscle | 20.9 | 87475 Kidney NAT (OD04622-03) | 9.1
Heart | 33.9 | 85973 Kidney Cancer (OD04450-01) | 4.4
Stomach | 19.8 | 85974 Kidney NAT (OD04450-03) | 11.3
Gastric ca.* (liver met) NCI-N87 | 2.2 | Kidney Cancer Clontech 8120607 | 2.1
Kidney | 15.8 | Kidney NAT Clontech 8120608 | 5.0
Kidney (fetal) | 8.1 | Kidney Cancer Clontech 8120613 | 0.1
Renal ca. 786-0 | 3.0 | Kidney NAT Clontech 8120614 | 3.6
Renal ca. A498 | 3.9 | Kidney Cancer Clontech 9010320 | 6.5
Renal ca. ACHN | 97.3 | Kidney NAT Clontech 9010321 | 5.6
Renal ca. TK-10 | 0.4 | Normal Uterus GENPAK 061018 | 8.9
Renal ca. UO-31 | 10.4 | Uterus Cancer GENPAK 064011 | 6.1
Renal ca. RXF 393 | 6.4 | Normal Thyroid Clontech A+ 6570-1** | 2.3
Pancreas | 13.1 | Thyroid Cancer GENPAK 064010 | 1.0
Pancreatic ca. CAPAN 2 | 0.1 | Thyroid Cancer INVITROGEN A302152 | 10.2
Ovary | 23.8 | Thyroid NAT INVITROGEN A302153 | 6.5
Ovarian ca. IGROV-1 | 0.0 | Normal Breast GENPAK 061019 | 8.1
Ovarian ca. OVCAR-3 | 26.6 | 84877 Breast Cancer (OD04566) | 6.0
Ovarian ca. OVCAR-4 | 1.4 | 85975 Breast Cancer (OD04590-01) | 8.0
Ovarian ca. OVCAR-5 | 3.4 | 85976 Breast Cancer Mets (OD04590-03) | 7.2
Ovarian ca. OVCAR-8 | 0.0 | 87070 Breast Cancer Metastasis (OD04655-05) | 2.2
Ovarian ca.* (ascites) SK-OV-3 | 0.0 | GENPAK Breast Cancer 064006 | 19.2
Prostate | 56.3 | Breast Cancer Clontech 9100266 | 4.0
Prostate ca.* (bone met) PC-3 | 0.0 | Breast NAT Clontech 9100265 | 6.6
Placenta | 66.0 | Breast Cancer INVITROGEN A209073 | 4.7
Pituitary gland | 4.5 | Breast NAT INVITROGEN A2090734 | 9.0
Uterus | 22.4 | Normal Liver GENPAK 061009 | 4.6
 | | Liver Cancer GENPAK 064003 | 1.1
 | | Liver Cancer Research Genetics RNA 1025 | 4.5
 | | Liver Cancer Research Genetics RNA 1026 | 4.6
 | | Paired Liver Cancer Tissue Research Genetics RNA 6004-T | 3.9
 | | Paired Liver Tissue Research Genetics RNA 6004-N | 3.6
 | | Paired Liver Cancer Tissue Research Genetics RNA 6005-T | 5.4
 | | Paired Liver Tissue Research Genetics RNA 6005-N | 5.1
 | | Normal Bladder GENPAK 061001 | 10.4
 | | Bladder Cancer Research Genetics RNA 1023 | 5.7
 | | Bladder Cancer INVITROGEN A302173 | 2.5
 | | 87071 Bladder Cancer (OD04718-01) | 4.9
 | | 87072 Bladder Normal Adjacent (OD04718-03) | 11.4
 | | Normal Ovary Res. Gen. | 3.8
 | | Ovarian Cancer GENPAK 064008 | 19.1
 | | 87492 Ovary Cancer (OD04768-07) | 2.1
 | | 87493 Ovary NAT (OD04768-08) | 23.8
 | | Normal Stomach GENPAK 061017 | 12.3
 | | NAT Stomach Clontech 9060359 | 12.2
 | | Gastric Cancer Clontech 9060395 | 8.1
 | | NAT Stomach Clontech 9060394 | 18.3
 | | Gastric Cancer Clontech | 7.7
 | | NAT Stomach Clontech 9060396 | 8.5
 | | Gastric Cancer GENPAK 064005 | 15.4

NOVX      Internal Accession Number   Results

NOV-1     10132038.0.67               Normal adjacent tissue to colon cancer tissue
                                      showed a higher expression of the gene as compared
                                      to colon cancer tissue itself. The results also
                                      demonstrate a similar profile for lung and ocular
                                      melanoma.
NOV-2a    10132038.0.139
NOV-2b    [illegible]

NOV-3a    18552586_EXT1               High level of expression in brain and moderate
                                      expression in lung and trachea, suggesting its
                                      potential role in diseases involving these tissues.
                                      Increased expression in normal colon as compared
                                      to colon cancer tissue. Cancerous uterus and ovary
                                      tissues exhibited significantly higher expression
                                      than their normal counterparts.
NOV-3b    18552586_EXT2
NOV-3c    18552586_EXT3
NOV-3d    18552586_EXT4

NOV-4a    10093872.0.107              Increased expression in normal bladder and
                                      moderate expression in prostate, heart, placenta,
                                      small intestine, and colorectal cells. Normal
                                      adjacent tissue (NAT) of prostate showed maximum
                                      expression.
NOV-4b    10093872.1
NOV-4c    10093872.0.38
NOV-4d    10093872.2
NOV-4e    10093872.3
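
The tumor versus normal-adjacent comparisons summarized in the Results column can be illustrated with a simple ratio. The following is a minimal sketch, not the analysis used in the source: the function name `fold_change` is hypothetical, the pairing of samples is inferred from the sample labels in the expression table, and the values are copied as reported without any assumption about their units.

```python
def fold_change(tumor: float, normal_adjacent: float) -> float:
    """Return tumor expression divided by matched normal-adjacent expression."""
    if normal_adjacent <= 0:
        raise ValueError("normal-adjacent expression must be positive")
    return tumor / normal_adjacent


# (tumor, normal-adjacent) values copied from the expression table;
# the pairing by sample label is an assumption made for illustration.
pairs = {
    "Bladder 87071/87072": (4.9, 11.4),
    "Ovary 87492/87493": (2.1, 23.8),
    "Liver 6004-T/6004-N": (3.9, 3.6),
    "Liver 6005-T/6005-N": (5.4, 5.1),
}

for label, (tumor, nat) in pairs.items():
    ratio = fold_change(tumor, nat)
    direction = "lower in tumor" if ratio < 1 else "higher in tumor"
    print(f"{label}: ratio {ratio:.2f} ({direction})")
```

A ratio below 1 corresponds to the pattern described above in which normal adjacent tissue shows higher expression than the matched cancer tissue. Any cutoff for calling a difference meaningful would be a further assumption and is not stated in the source.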



OTHER EMBODIMENTS
While the invention has been described in conjunction with the detailed description
thereof, the foregoing description is intended to illustrate and not limit the scope of the
invention, which is defined by the scope of the appended claims. Other aspects, advantages,
and modifications are within the scope of the following claims.

What is claimed is:

1. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of:

a) a mature form of the amino acid sequence selected from the group consisting of
SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23;

b) a variant of a mature form of the amino acid sequence selected from the group
consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, wherein
any amino acid in the mature form is changed to a different amino acid,
provided that no more than 15% of the amino acid residues in the sequence of
the mature form are so changed;

c) the amino acid sequence selected from the group consisting of SEQ ID NO: 2,
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23;

d) a variant of the amino acid sequence selected from the group consisting of SEQ
ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 wherein any amino acid
specified in the chosen sequence is changed to a different amino acid, provided
that no more than 15% of the amino acid residues in the sequence are so
changed; and

e) a fragment of any of a) through d).

2. The polypeptide of claim 1 that is a naturally occurring allelic variant of the sequence 
selected from the group consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21,
or 23. 

3. The polypeptide of claim 2, wherein the variant is the translation of a single nucleotide 
polymorphism. 

4. The polypeptide of claim 1 that is a variant polypeptide described therein, wherein any 
amino acid specified in the chosen sequence is changed to provide a conservative 
substitution. 



5. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding a
polypeptide comprising an amino acid sequence selected from the group consisting of:

a) a mature form of the amino acid sequence given SEQ ID NO: 2, 4, 5, 7, 9, 11,
13, 15, 17, 19, 21, or 23;

b) a variant of a mature form of the amino acid sequence selected from the group
consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 wherein
any amino acid in the mature form of the chosen sequence is changed to a
different amino acid, provided that no more than 15% of the amino acid
residues in the sequence of the mature form are so changed;

c) the amino acid sequence selected from the group consisting of SEQ ID NO: 2,
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23;

d) a variant of the amino acid sequence selected from the group consisting of SEQ
ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, in which any amino acid
specified in the chosen sequence is changed to a different amino acid, provided
that no more than 15% of the amino acid residues in the sequence are so
changed;

e) a nucleic acid fragment encoding at least a portion of a polypeptide comprising
the amino acid sequence selected from the group consisting of SEQ ID NO: 2,
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 or any variant of said polypeptide
wherein any amino acid of the chosen sequence is changed to a different amino
acid, provided that no more than 10% of the amino acid residues in the
sequence are so changed; and

f) the complement of any of said nucleic acid molecules.

6. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises the
nucleotide sequence of a naturally occurring allelic nucleic acid variant.

7. The nucleic acid molecule of claim 5 that encodes a variant polypeptide, wherein the
variant polypeptide has the polypeptide sequence of a naturally occurring polypeptide
variant.

8. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises a
single nucleotide polymorphism encoding said variant polypeptide.

9. The nucleic acid molecule of claim 5, wherein said nucleic acid molecule comprises a
nucleotide sequence selected from the group consisting of:

a) the nucleotide sequence selected from the group consisting of SEQ ID NO: 1,
3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57;

b) a nucleotide sequence wherein one or more nucleotides in the nucleotide
sequence selected from the group consisting of SEQ ID NO: 1, 3, 6, 8, 10, 12,
14, 16, 18, 20, 22, or 57 is changed from that selected from the group
consisting of the chosen sequence to a different nucleotide provided that no
more than 15% of the nucleotides are so changed;

c) a nucleic acid fragment of the sequence selected from the group consisting of
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57; and

d) a nucleic acid fragment wherein one or more nucleotides in the nucleotide
sequence selected from the group consisting of SEQ ID NO: 1, 3, 6, 8, 10, 12,
14, 16, 18, 20, 22, or 57 is changed from that selected from the group
consisting of the chosen sequence to a different nucleotide provided that no
more than 15% of the nucleotides are so changed.

10. The nucleic acid molecule of claim 5, wherein said nucleic acid molecule hybridizes
under stringent conditions to the nucleotide sequence selected from the group
consisting of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a complement
of said nucleotide sequence.

11. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises a
nucleotide sequence in which any nucleotide specified in the coding sequence of the
chosen nucleotide sequence is changed from that selected from the group consisting of
the chosen sequence to a different nucleotide provided that no more than 15% of the
nucleotides in the chosen coding sequence are so changed, an isolated second
polynucleotide that is a complement of the first polynucleotide, or a fragment of any of
them.

12. A vector comprising the nucleic acid molecule of claim 11.

13. The vector of claim 12, further comprising a promoter operably linked to said nucleic
acid molecule.

14. A cell comprising the vector of claim 12.

15. An antibody that binds immunospecifically to the polypeptide of claim 1.

16. The antibody of claim 15, wherein said antibody is a monoclonal antibody.

17. The antibody of claim 15, wherein the antibody is a humanized antibody.

18. A method for determining the presence or amount of the polypeptide of claim 1 in a 
sample, the method comprising: 

(a) providing said sample; 

(b) introducing said sample to an antibody that binds immunospecifically to the 
polypeptide; and 

(c) determining the presence or amount of antibody bound to said polypeptide, 
thereby determining the presence or amount of polypeptide in said sample.

19. A method for determining the presence or amount of the nucleic acid molecule of
claim 5 in a sample, the method comprising: 

(a) providing said sample; 

(b) introducing said sample to a probe that binds to said nucleic acid molecule; and 

(c) determining the presence or amount of said probe bound to said nucleic acid 
molecule, thereby determining the presence or amount of the nucleic acid 
molecule in said sample. 

20. A method of identifying an agent that binds to the polypeptide of claim 1, the method
comprising:

(a) introducing said polypeptide to said agent; and

(b) determining whether said agent binds to said polypeptide.

21. A method for identifying a potential therapeutic agent for use in treatment of a
pathology, wherein the pathology is related to aberrant expression or aberrant
physiological interactions of the polypeptide of claim 1, the method comprising:

(a) providing a cell expressing the polypeptide of claim 1 and having a property or
function ascribable to the polypeptide;

(b) contacting the cell with a composition comprising a candidate substance; and

(c) determining whether the substance alters the property or function ascribable to
the polypeptide;

whereby, if an alteration observed in the presence of the substance is not observed when the
cell is contacted with a composition devoid of the substance, the substance is identified
as a potential therapeutic agent.

22. A method for modulating the activity of the polypeptide of claim 1, the method
comprising introducing a cell sample expressing the polypeptide of said claim with a
compound that binds to said polypeptide in an amount sufficient to modulate the
activity of the polypeptide.

23. A method of treating or preventing a pathology associated with the polypeptide of
claim 1, said method comprising administering the polypeptide of claim 1 to a subject
in which such treatment or prevention is desired in an amount sufficient to treat or
prevent said pathology in said subject.

24. The method of claim 23, wherein said subject is a human.

25. A method of treating or preventing a pathology associated with the polypeptide of
claim 1, said method comprising administering to a subject in which such treatment or
prevention is desired a NOVX nucleic acid in an amount sufficient to treat or prevent
said pathology in said subject.

26. The method of claim 25, wherein said subject is a human.

27. A method of treating or preventing a pathology associated with the polypeptide of
claim 1, said method comprising administering to a subject in which such treatment or
prevention is desired a NOVX antibody in an amount sufficient to treat or prevent said
pathology in said subject.



28. The method of claim 27, wherein the subject is a human. 



29. A pharmaceutical composition comprising the polypeptide of claim 1 and a
pharmaceutically acceptable carrier.

30. A pharmaceutical composition comprising the nucleic acid molecule of claim 5 and a
pharmaceutically acceptable carrier.

31. A pharmaceutical composition comprising the antibody of claim 15 and a 
pharmaceutically acceptable carrier. 

32. A kit comprising in one or more containers, the pharmaceutical composition of claim
29.

33. A kit comprising in one or more containers, the pharmaceutical composition of claim
30.

34. A kit comprising in one or more containers, the pharmaceutical composition of claim
31.

35. The use of a therapeutic in the manufacture of a medicament for treating a syndrome
associated with a human disease, the disease selected from a pathology associated with
the polypeptide of claim 1, wherein said therapeutic is the polypeptide of claim 1.

36. The use of a therapeutic in the manufacture of a medicament for treating a syndrome
associated with a human disease, the disease selected from a pathology associated with
the polypeptide of claim 1, wherein said therapeutic is a NOVX nucleic acid.

37. The use of a therapeutic in the manufacture of a medicament for treating a syndrome
associated with a human disease, the disease selected from a pathology associated with
the polypeptide of claim 1, wherein said therapeutic is a NOVX antibody.

38. A method for screening for a modulator of activity or of latency or predisposition to a
pathology associated with the polypeptide of claim 1, said method comprising:

a) administering a test compound to a test animal at increased risk for a pathology
associated with the polypeptide of claim 1, wherein said test animal
recombinantly expresses the polypeptide of claim 1;

b) measuring the activity of said polypeptide in said test animal after
administering the compound of step (a); and

c) comparing the activity of said protein in said test animal with the activity of
said polypeptide in a control animal not administered said polypeptide, wherein
a change in the activity of said polypeptide in said test animal relative to said
control animal indicates the test compound is a modulator of latency of, or
predisposition to, a pathology associated with the polypeptide of claim 1.

39. The method of claim 38, wherein said test animal is a recombinant test animal that
expresses a test protein transgene or expresses said transgene under the control of a
promoter at an increased level relative to a wild-type test animal, and wherein said
promoter is not the native gene promoter of said transgene.

40. A method for determining the presence of or predisposition to a disease associated with
altered levels of the polypeptide of claim 1 in a first mammalian subject, the method
comprising:

a) measuring the level of expression of the polypeptide in a sample from the first
mammalian subject; and

b) comparing the amount of said polypeptide in the sample of step (a) to the
amount of the polypeptide present in a control sample from a second
mammalian subject known not to have, or not to be predisposed to, said
disease, wherein an alteration in the expression level of the polypeptide in the
first subject as compared to the control sample indicates the presence of or
predisposition to said disease.

41. A method for determining the presence of or predisposition to a disease associated with
altered levels of the nucleic acid molecule of claim 5 in a first mammalian subject, the
method comprising:

a) measuring the amount of the nucleic acid in a sample from the first mammalian
subject; and

b) comparing the amount of said nucleic acid in the sample of step (a) to the
amount of the nucleic acid present in a control sample from a second
mammalian subject known not to have or not be predisposed to, the disease;
wherein an alteration in the level of the nucleic acid in the first subject as
compared to the control sample indicates the presence of or predisposition to
the disease.

42. A method of treating a pathological state in a mammal, the method comprising
administering to the mammal a polypeptide in an amount that is sufficient to alleviate
the pathological state, wherein the polypeptide is a polypeptide having an amino acid
sequence at least 95% identical to a polypeptide comprising the amino acid sequence
selected from the group consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21,
or 23 or a biologically active fragment thereof.

43. A method of treating a pathological state in a mammal, the method comprising
administering to the mammal the antibody of claim 15 in an amount sufficient to
alleviate the pathological state.
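
The numeric limits recited in the claims (no more than 15% of residues changed; at least 95% identical) can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not a method taken from the source: it treats a variant as a substitution-only change to a reference of equal length, ignores insertions, deletions, and alignment, and uses hypothetical names (`fraction_changed`, `within_substitution_limit`, `percent_identity`) and a made-up 20-residue peptide rather than any SEQ ID NO.

```python
def fraction_changed(reference: str, variant: str) -> float:
    """Fraction of positions at which the variant differs from the reference.

    Assumes equal-length, substitution-only sequences (no gaps or alignment).
    """
    if not reference:
        raise ValueError("reference sequence must not be empty")
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    changed = sum(1 for ref, var in zip(reference, variant) if ref != var)
    return changed / len(reference)


def within_substitution_limit(reference: str, variant: str,
                              max_fraction: float = 0.15) -> bool:
    """True if no more than max_fraction of the residues are changed."""
    return fraction_changed(reference, variant) <= max_fraction


def percent_identity(reference: str, variant: str) -> float:
    """Percent of positions that are identical under the same assumptions."""
    return 100.0 * (1.0 - fraction_changed(reference, variant))


# Hypothetical 20-residue peptide used only for illustration.
reference = "MKTAYIAKQRQISFVKSHFS"
two_changes = "MRTAYIAKQRQISFVKSHFA"   # 2 of 20 residues changed
four_changes = "MRTGYIAKQRQLSFVKSHFA"  # 4 of 20 residues changed

print(within_substitution_limit(reference, two_changes))
print(within_substitution_limit(reference, four_changes))
print(percent_identity(reference, two_changes))
```

Under these assumptions the two-substitution variant falls inside a 15% limit and the four-substitution variant does not. A real comparison against a mature form would first need the mature sequence and an alignment, neither of which this sketch attempts.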